Abstract —This paper studies randomly spread code-division
multiple access (CDMA) and multiuser detection in the large- 
system limit using the replica method developed in statistical
physics. Arbitrary input distributions and flat fading are
considered. A generic multiuser detector in the form of the 
posterior mean estimator is applied before single-user decoding. 
The generic detector can be particularized to the matched 
filter, decorrelator, linear MMSE detector, the jointly or the 
individually optimal detector, and others. It is found that the 
detection output for each user, although in general asymptotically 
non-Gaussian conditioned on the transmitted symbol, converges 
as the number of users goes to infinity to a deterministic function of
a “hidden” Gaussian statistic independent of the interferers. Thus
the multiuser channel can be decoupled: Each user experiences 
an equivalent single-user Gaussian channel, whose signal-to-noise
ratio suffers a degradation due to the multiple-access interference.
The uncoded error performance (e.g., symbol-error-rate)
and the mutual information can then be fully characterized using 
the degradation factor, also known as the multiuser efficiency, 
which can be obtained by solving a pair of coupled fixed-point 
equations identified in this paper. Based on a general linear vector 
channel model, the results are also applicable to MIMO channels
such as in multiantenna systems. 

Index Terms — Channel capacity, code-division multiple access 
(CDMA), free energy, multiple-input multiple-output (MIMO) 
channel, multiuser detection, multiuser efficiency, replica method, 
statistical mechanics. 


I. Introduction 

Consider a multidimensional Euclidean space in which each 
user (or data stream) randomly selects a “signature vector” 
and modulates its own information-bearing symbols onto it 
for transmission. The received signal is a superposition of all 
users’ signals corrupted by Gaussian noise. Such a multiuser 
scheme, best described by a vector channel model, is very 
versatile and is widely used in applications that include code-division
multiple access (CDMA), as well as certain multi-input
multi-output (MIMO) systems. With knowledge of all
signature vectors, the goal is to estimate the transmitted 
symbols and eventually recover the information intended for 
all or a subset of the users. 
This paper focuses on a paradigm of multiuser channels,
known as randomly spread CDMA [1]. In such a CDMA system,
a number of users share a common medium to communicate
to a single receiver simultaneously over the same bandwidth. 
Each user employs a randomly generated spreading sequence 
(signature waveform) with a large time-bandwidth product. 
This multiaccess method has many advantages, particularly
in wireless communications: frequency diversity, robustness
to channel impairments, ease of resource allocation, etc. The
price to pay is multiple-access interference (MAI) due to 
non-orthogonal spreading sequences from all users. Numerous 
multiuser detection techniques have been proposed to mitigate 
the MAI to various degrees. This work is concerned with 
the performance of such multiuser systems in two aspects: 
1) uncoded symbol-error-rate (or equivalently, multiuser efficiency)
and 2) spectral efficiency, namely the total information
rate achievable by coded transmission and normalized by the 
dimension of the multiuser channel. 

A. Gaussian or Non-Gaussian?

The most efficient use of a multiuser channel is through 
jointly optimal decoding, which is prohibitively complex with 
a large population of users. Although suboptimal, the philosophy
of separating the tasks of untangling the mutually interfering
streams and exploiting the redundancy in the coded
streams has received much attention. A multiuser detection 
front end supplies individual (hard or soft) decision statistics 
to independent single-user decoders. With the exception of 
decorrelating receivers, the multiuser detector outputs are still 
contaminated by multiaccess interference, and their statistical 
characterization is of paramount interest. 

In [2], [3], [4], Verdú first used the concept of multiuser
efficiency to refer to the degradation of the output
signal-to-noise ratio (SNR) relative to a single-user channel calibrated
at the same bit-error-rate (BER) in binary (antipodal)
uncoded transmission. The multiuser efficiencies of the single- 
user matched filter, decorrelator, and linear minimum mean- 
square error (MMSE) detector were found as functions of 
the correlation matrix of the spreading sequences. Particular 
attention has been given to the asymptotic multiuser efficiency 
in the more tractable region of high SNR. Expressions for 
the optimum (high-SNR) asymptotic multiuser efficiency were 
found in [4], [5]. 

In the large-system limit, where the number of users and 
the spreading factor both tend to infinity with a fixed ratio, the 
dependence of system performance on the sequences vanishes,
and random matrix theory proves to be a capable tool for 
analyzing linear detectors. The limiting multiuser efficiency 
of the matched filter is trivial [1]. The large-system multiuser 
efficiency of the linear MMSE detector is obtained explicitly 
in [1] for the equal-power case (perfect power control), and in 
[6] for the case with flat fading as the solution to the Tse-Hanly 
fixed-point equation. The efficiency of the decorrelator is also 
known [1], [7], [8]. The success of the multiuser efficiency 
analysis of the wide class of linear detectors hinges on the
fact that 1) the detection output is a sum of independent components:
the desired signal, the MAI, and Gaussian background
noise, e.g., the decision statistic for user k is

$$ \langle X_k \rangle = X_k + I_k + N_k, \tag{1} $$

and 2) the multiple-access interference $I_k$ is asymptotically
Gaussian (e.g., [9]). As far as linear multiuser detectors are
concerned, regardless of the input distribution, the performance
is fully characterized by the noise enhancement associated
with the MAI variance. Indeed, by regarding the multiuser
detector as part of the channel, an individual user experiences 
asymptotically a single-user Gaussian channel with an SNR 
degradation equal to the multiuser efficiency. 
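
As a numerical illustration of the fixed-point characterization mentioned above for the linear MMSE detector, the following Python sketch iterates a commonly quoted normalized form of the Tse-Hanly equation, eta = 1 / (1 + beta * E[snr / (1 + eta * snr)]), where beta is the ratio of the number of users to the spreading factor. This form, the unit-noise-variance normalization, the discrete SNR distribution, and the function name are assumptions made for illustration; they are not taken from the text of this section.

```python
import numpy as np


def mmse_multiuser_efficiency(beta, snrs, probs=None, tol=1e-12, max_iter=100000):
    """Iterate eta <- 1 / (1 + beta * E[snr / (1 + eta * snr)]).

    Assumption (not from the surrounding text): this is the normalized
    Tse-Hanly fixed-point equation for the linear MMSE detector with unit
    noise variance; `snrs` and `probs` describe a discrete SNR distribution.
    """
    snrs = np.atleast_1d(np.asarray(snrs, dtype=float))
    if probs is None:
        # Default: all listed SNR values are equally likely.
        probs = np.full(snrs.shape, 1.0 / snrs.size)
    probs = np.asarray(probs, dtype=float)
    eta = 1.0  # single-user value; the iterates decrease from this start
    for _ in range(max_iter):
        new_eta = 1.0 / (1.0 + beta * np.sum(probs * snrs / (1.0 + eta * snrs)))
        if abs(new_eta - eta) < tol:
            return float(new_eta)
        eta = new_eta
    return float(eta)
```

With a single SNR value (perfect power control) the equation reduces to a quadratic in eta, which gives a convenient closed-form check of the iteration.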

The performance analysis of nonlinear detectors such as 
the optimal ones is a hard problem. The difficulty here is
inherent to nonlinear operations: The detection output cannot
be decomposed as a sum of independent components
associated with the desired signal, the interference, and the
noise respectively. Moreover, the detection output is in general 
asymptotically non-Gaussian conditioned on the input. An 
extreme case is the maximum-likelihood multiuser detector for 
binary transmission, the hard decision output of which takes 
only two values. The difficulty remains even if we consider 
soft detection outputs. Hence, unlike for a Gaussian output 
statistic, the conditional variance of a general detection output 
does not lead to a simple characterization of the multiuser 
efficiency or error performance. For illustration, Figure 1 plots
the approximate probability density function obtained from
the histogram of the soft output statistic of the individually
optimal detector conditioned on +1 being transmitted. The
simulated system has 8 users with binary inputs, a spreading
factor of 12, and SNR = 2 dB. A total of 10,000 trials were
recorded. Note that negative decision values correspond to
decision errors; hence the area under the curve on the negative
half-line gives the BER. The distribution shown in Figure 1
is far from Gaussian. Thus the usual notion of output SNR
fails to capture the essence of system performance. In fact, 
much literature is devoted to evaluating the error performance 
by Monte Carlo simulation. 
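
An experiment of the kind described above can be sketched in a few lines of Python. The sketch below computes the individually optimal soft output (the posterior mean) for one user by enumerating all binary input vectors, conditioned on +1 being transmitted. The binary spreading chips, the unit noise variance, the random seed, and the function name are assumptions made for illustration; the text does not specify how the original simulation was implemented.

```python
import itertools

import numpy as np


def simulate_soft_outputs(K=8, L=12, snr_db=2.0, trials=2000, seed=0):
    """Monte Carlo samples of the posterior mean of user 1 given +1 was sent.

    Assumptions (hypothetical setup): binary +/-1 chips scaled by 1/sqrt(L),
    equal-power users, equiprobable +/-1 inputs, unit-variance real noise.
    """
    rng = np.random.default_rng(seed)
    amp = np.sqrt(10.0 ** (snr_db / 10.0))
    # All 2^K candidate input vectors in {-1, +1}^K, one per row.
    cands = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))
    out = np.empty(trials)
    for t in range(trials):
        S = amp * rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)
        x = rng.choice([-1.0, 1.0], size=K)
        x[0] = 1.0  # condition on +1 being transmitted by the user of interest
        y = S @ x + rng.standard_normal(L)
        # Log-likelihood of every candidate under unit-variance Gaussian noise.
        resid = y[None, :] - cands @ S.T
        log_w = -0.5 * np.sum(resid ** 2, axis=1)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Posterior mean of the first user's symbol.
        out[t] = w @ cands[:, 0]
    return out
```

A histogram of the returned samples (for example with `numpy.histogram`) gives an empirical density of the soft output, and the fraction of negative samples estimates the error rate of the corresponding hard decision.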

This paper makes a contribution to the understanding of 
multiuser detection in the large-system regime. It is found 
under certain assumptions that the output decision statistic 
of a nonlinear detector, such as the one whose distribution
is depicted in Figure 1, converges in fact to a very simple
monotone function of a “hidden” conditionally Gaussian random
variable, i.e.,

$$ \langle X_k \rangle \to f(Z_k) \tag{2} $$



Fig. 1. The empirical probability density function of the individually optimal
soft detection output (horizontal axis: posterior mean estimate) conditioned on
+1 being transmitted. The system has 8 users, the spreading factor is 12, and
SNR = 2 dB.



Fig. 2. The empirical probability density function of the hidden equivalent 
Gaussian statistic conditioned on +1 being transmitted. The system has 8 
users, the spreading factor is 12, and SNR=2 dB. The asymptotic Gaussian 
distribution is also plotted for comparison. 


where $Z_k = X_k + N'_k$ and $N'_k$ is Gaussian. One may contend
that it is always possible to monotonically map a non-Gaussian
random variable to a Gaussian one. What is surprisingly simple
and useful here is that 1) the mapping $f$ depends neither on
the instantaneous spreading sequences nor on the transmitted
symbols which we wish to estimate in the first place; and
2) the statistic $Z_k$ is equal to the desired signal plus an
independent Gaussian noise. Indeed, a few parameters of the
system determine the function $f$. By applying an inverse
of this function to the detection output $\langle X_k \rangle$, an equivalent
conditionally Gaussian statistic $Z_k$ is recovered, so that we are
back to the familiar ground where the output SNR (defined for
the equivalent Gaussian statistic $Z_k$) completely characterizes
the system performance. The multiuser efficiency is simply 
obtained as the ratio of the output and input SNRs. We 
will refer to this result as the “decoupling principle” since 
asymptotically, after applying $f^{-1}$, each user’s data goes
through an equivalent single-user channel with an additive 
Gaussian noise which is independent of the interferers’ data. 

Under certain assumptions, we show the decoupling principle
to hold for not only optimal detection, but also a broad
family of generic multiuser detectors, called the posterior mean 
estimators (PME), which compute the mean value of the input 
conditioned on the observation assuming a certain postulated 
posterior probability distribution. Simply put, the generic 
detector is the optimal detector for a postulated multiuser 
system that may be different from the actual one. In case 
the postulated posterior is identical to the one induced by the 
actual multiuser channel and input, the PME is a soft version 
of the individually optimal detector. The postulated posterior, 
however, can also be chosen such that the resulting PME 
becomes one of many other detectors, including but not limited 
to the matched filter, decorrelator, linear MMSE detector, as 
well as the jointly optimal detector. Moreover, the decoupling 
principle holds for not only binary inputs, but arbitrary input 
distributions with finite power. 

For illustration of the new findings, Figure 2 plots the
approximate probability density function obtained from the
histogram of the conditionally Gaussian statistic obtained by
applying $f^{-1}$ to the non-Gaussian detection output in Figure 1.
The theoretically predicted Gaussian density function is also
shown for comparison. The “fit” is good considering that a
relatively small system of 8 users with a processing gain of
12 is considered. Note that in case the multiuser detector is
linear, the mapping $f$ is also linear, and (2) reduces to (1).

By virtue of the decoupling principle, the mutual information
between the input and the output of the generic detector
for each user converges to the input-output mutual information 
of the equivalent single-user Gaussian channel under the same 
input, which admits a simple analytical expression. Hence 
the large-system spectral efficiency of several well-known 
linear detectors, first found in [10] and [11] with and without 
fading respectively, can be recovered straightforwardly using 
the decoupling principle. New results on the spectral efficiency 
of nonlinear detection and arbitrary inputs under both joint and 
separate decoding are also obtained. Furthermore, the additive
decomposition of optimal spectral efficiency as a sum of 
single-user efficiencies and a joint decoding gain [11] applies 
under more general conditions than originally thought. 

As in random matrix spectrum analysis, our large-system 
results are representative of the behavior of systems of moderate
size. As shown in Figures 1 and 2, a randomly spread
system with as few as 8 users can often be well approximated 
by the large-system limiting results. 

B. Random Matrix vs. Spin Glass 

Much of the early success in the large-system analysis of 
linear detectors relies on the fact that the multiuser efficiency 
of a finite-size system can be written as an explicit function of 
the eigenvalues of the correlation matrix of the random signature
waveforms, the empirical distributions of which converge
to a known function in the large-system limit [12], [13]. As a
result, the large-system multiuser efficiency can be obtained as 
an integral with respect to the limiting eigenvalue distribution. 
Indeed, this random matrix technique is applicable to any 
performance measure that can be expressed as a function of 
the eigenvalues. Based on an explicit expression for CDMA 
channel capacity in [14], Verdú and Shamai quantified the
optimal spectral efficiency in the large-system limit [10], [11] 
(see also [15], [16]). The expression found in [10] also solved 
the capacity of single-user narrowband multiantenna channels 
as the number of antennas grows—a problem that was open 
since the pioneering work of Foschini [17] and Telatar [18].
Unfortunately, few explicit expressions of the efficiencies in 
terms of eigenvalues are available beyond the above cases. 
Much less success has been reported in the application of 
random matrix theory when either the detector is nonlinear 
or the inputs are non-Gaussian constellations. 

A major consequence of random matrix theory is that 
the dependence of performance measures on the spreading 
sequences vanishes as the system size increases without bound. 
In other words, the performance measures are “self-averaging.” 
In the context of physical science, this property is nothing but 
a manifestation of a fundamental law that the fluctuation of 
macroscopic properties of certain many-body systems vanishes 
in the thermodynamic limit, i.e., when the number of interacting
bodies becomes large. This falls under the general scope of
statistical mechanics (also known as statistical physics), whose principal
goal is to study the macroscopic properties of physical systems
from the principle of microscopic interactions. Indeed,
the asymptotic eigenvalue distribution of certain correlation 
matrices can be derived via statistical physics (e.g., [19]). 
Tanaka pioneered the use of statistical physics concepts and
methodologies in multiuser detection and obtained the large- 
system uncoded minimum BER (hence the optimal multiuser 
efficiency) and spectral efficiency with equal-power binary 
inputs [20], [21], [22], [23]. In [8] we further elucidated 
the relationship between CDMA and statistical physics and 
generalized to the case of unequal powers. Inspired by [23], 
Müller and Gerstacker [24] studied the channel capacity under
separate decoding and noticed that the additive decomposition
of the optimum spectral efficiency in [11] holds also for binary
inputs. Müller thus further conjectured the same formula to be
valid regardless of the input distribution [25]. 

In this paper, we build upon Tanaka’s ground-breaking 
contribution [23] and present a unified treatment of Gaussian 
CDMA channels and multiuser detection assuming an arbitrary 
input distribution and flat fading characteristic. A wide class 
of multiuser detectors, optimal as well as suboptimal, are 
studied under the same umbrella of posterior mean estimation. 
The central results are the decoupling principle for generic 
multiuser detection, the characterization of multiuser efficiency 
via a pair of nonlinear equations, as well as the spectral 
efficiencies of separate and joint decoding. 

The key tool in this paper, the replica method, has its 
origin in spin glass theory [26]. Analogies between statistical 
physics and neural networks, coding, image processing, and 
communications have long been noted (e.g., [27], [28]). There 
have been many recent activities applying statistical physics 






Fig. 3. A multiuser system with joint decoding. 


Fig. 4. Multiuser detection followed by independent single-user decoding. 


wisdom to sparse-graph error-correcting codes (e.g., [29], [30], 
[31], [32], [33]). Similar techniques have also been used to 
study capacity of MIMO channels [34]. Among others, mean 
field theory is used to derive iterative detection algorithms 
[35], [36]. The first application of the replica method to 
multiuser detection was made in [23]. In this paper, we draw 
a parallel between the general statistical inference problem in 
multiuser communications and the problem of determining the 
configuration of random spins subject to quenched randomness.
For the purpose of analytical tractability, we will invoke
common assumptions in the statistical physics literature: 1)
the self-averaging property applies, 2) the “replica trick” is 
valid, and 3) replica symmetry holds. These assumptions have 
been used successfully in many problems in statistical physics 
as well as in neural networks and coding theory, to name 
a few, while a complete justification of the replica method 
is a notoriously difficult challenge in mathematical physics, 
which has seen some important progress recently [37], [38]. 
The results in this paper are based on the aforementioned
assumptions, and therefore their mathematical rigor is pending
on breakthroughs in those problems. A set of easy-to-check
sufficient conditions under which the replica method is justified
is yet to be found. In statistical physics it has been found
that results obtained using the replica method may still capture 
many of the qualitative features of the system performance 
even when the key assumptions fail [39], [40]. Furthermore, 
the decoupling principle carries great practicality and finds 
convenient uses in finite-size systems where the analytical 
asymptotic results are a good approximation. 

The remainder of this paper is organized as follows. Section
II gives the model and summarizes the main results. Relevant
statistical physics concepts and methodologies are introduced
in Section III. Calculations based on a real-valued channel
are presented in Section IV. Complex-valued channels are
discussed in Section V, followed by some numerical examples
in Section VI. Some conclusions are drawn in Section VII.

II. Model and Summary of Results 
A. System Model 

Consider the synchronous K-user CDMA system with
spreading factor L as depicted in Figure 3. Each encoder maps
its message into a sequence of channel symbols. All users employ
the same type of signaling so that at each interval the K
symbols are independent identically distributed (i.i.d.) random
variables with distribution (probability measure) $P_X$, which


has zero mean and unit variance. Let $X = [X_1, \ldots, X_K]^\top$
denote the vector of input symbols from the K users in one
symbol interval. For notational convenience in the analysis,
it is assumed that either a probability density function or a
probability mass function of $P_X$ exists, and is denoted by $p_X$.*
Let also $p_X(x) = \prod_{k=1}^{K} p_X(x_k)$ denote the joint (product)
distribution.

Let the instantaneous SNR of user k be denoted by $\mathsf{snr}_k$
and $A = \mathrm{diag}\{\sqrt{\mathsf{snr}_1}, \ldots, \sqrt{\mathsf{snr}_K}\}$. Denote the spreading
sequence of user k by $s_k = \frac{1}{\sqrt{L}} [S_{1k}, S_{2k}, \ldots, S_{Lk}]^\top$, where
$S_{nk}$ are i.i.d. random variables with zero mean and finite
moments. Let the symbols and spreading sequences be randomly
chosen for each user and not dependent on the SNRs.
The $L \times K$ channel “state” matrix is denoted by $S =
[\sqrt{\mathsf{snr}_1}\, s_1, \ldots, \sqrt{\mathsf{snr}_K}\, s_K]$. The synchronous CDMA channel
with flat fading is described by:

$$ Y = \sum_{k=1}^{K} \sqrt{\mathsf{snr}_k}\, s_k X_k + N \tag{3} $$

$$ \phantom{Y} = S X + N \tag{4} $$

where N is a vector consisting of i.i.d. zero-mean Gaussian
random variables. Depending on the domain in which the inputs
and spreading chips take values, the input-output relationship
(4) describes either a real-valued or a complex-valued fading
channel.

The linear system (4) is quite versatile. In particular, with
$\mathsf{snr}_k = \mathsf{snr}$ for all k, it models the canonical MIMO channel
in which all propagation coefficients are i.i.d. An example is 
single-user communication with K transmit antennas and L 
receive antennas, where the channel coefficients are not known 
to the transmitter. 

B. Posterior Mean Estimation 

The information-bearing symbol (vector) X is drawn according
to the prior distribution $p_X$. The channel response
to the input X is an output Y generated according to a
conditional probability distribution $p_{Y|X,S}$, where S is the
channel state. Upon receiving Y, the estimator would like to
infer the transmitted symbol X with knowledge of S.

The most efficient use of the multiuser channel (4) is
achieved by optimal joint decoding as depicted in Figure 3.
Due to the complexity of joint decoding, the processing is 

*The results in this paper hold in full generality and do not depend on the 
existence of a probability density or mass function. 
often separated into multiuser detection followed by single-
user error-control decoding as shown in Figure 4. A multiuser
detector front end estimates the transmitted symbols given
the received signal and the channel state, without using any
knowledge of the error-control codes employed by the transmitters.
Conversely, each single-user decoder only observes
the sequence of decision statistics corresponding to one user, 
and does not take into account the existence of any other 
users (in particular, it does not use any knowledge of the 
spreading sequences). By adopting this separate decoding 
approach, the channel together with the multiuser detector 
front end is viewed as a bank of coupled single-user channels. 
The detection output sequence for an individual user is in 
general not a sufficient statistic for decoding this user’s own 
information. 

To capture the intended suboptimal structure, one has to 
restrict the capability of the multiuser detector; otherwise the 
detector could in principle encode the channel state and the 
received signal $(S, Y)$ into a single real number as its output
to each user, which is a sufficient statistic for all users. A
plausible choice is the (canonical) posterior mean estimator,
which computes the mean value of the posterior probability
distribution $p_{X|Y,S}$, hereafter denoted by angle brackets $\langle \cdot \rangle$:

$$ \langle X \rangle = \mathsf{E}\{X \mid Y, S\}. \tag{5} $$


Also known as the conditional mean estimator, this estimator 
achieves the minimum mean-square error for each user, and 
is therefore the (nonlinear) MMSE detector. We also regard 
it as a soft-output version of the individually optimal multiuser
detector (assuming uncoded transmission). The posterior
probability distribution $p_{X|Y,S}$ is induced from the input
distribution $p_X$ and the conditional Gaussian density function
$p_{Y|X,S}$ of the channel (4) by the Bayes formula:

$$ p_{X|Y,S}(x \mid y, S) = \frac{p_X(x)\, p_{Y|X,S}(y \mid x, S)}{\int p_X(x)\, p_{Y|X,S}(y \mid x, S)\, \mathrm{d}x}. \tag{6} $$


The PME can be understood as an “informed” optimal
estimator which is supplied with the posterior distribution
$p_{X|Y,S}$ and then computes its mean. A generalization of
the canonical PME is conceivable: Instead of informing the
estimator with the actual posterior $p_{X|Y,S}$, we can supply at
will any other well-defined conditional distribution $q_{X|Y,S}$.
Given $(Y, S)$, the estimator can nonetheless perform “optimal”
estimation based on this postulated measure q. We
call this the generalized posterior mean estimation, which is
conveniently denoted as

$$ \langle X \rangle_q = \mathsf{E}_q\{X \mid Y, S\} \tag{7} $$

where $\mathsf{E}_q\{\cdot\}$ stands for the expectation with respect to the
postulated measure q. For brevity, we will also refer to (7)
by the name of the posterior mean estimator, or simply the
PME. In view of (5), the subscript in (7) can be dropped if
the postulated measure q coincides with the actual one p.

In general, postulating $q \neq p$ causes degradation in detection
performance. Such a strategy may be either due to lack of 
knowledge of the true statistics or a particular choice that 
corresponds to a certain estimator of interest. In principle, any
deterministic estimation can be regarded as a PME since we
can always choose to put a unit mass at the desired estimation
output given $(Y, S)$. We will see in Section II-C that by postulating
an appropriate measure q, the PME can be particularized
to many important multiuser detectors. As will also be shown
in this paper, the generic representation (7) allows a uniform
treatment of a large family of multiuser detectors which results
in a simple performance characterization for all of them.

It is enlightening to introduce a new concept: the retrochannel,
which is defined for a given channel and input as a
companion channel in the opposite direction characterized by
a posterior distribution. Given the multiuser channel $p_{Y|X,S}$
with an input $p_X$, we have a (canonical) retrochannel defined
by $p_{X|Y,S}$ in (6), which, upon an input $(Y, S)$, generates a
random output X according to $p_{X|Y,S}$. A retrochannel in
the single-user setting is similarly defined. In general, any
valid posterior distribution $q_{X|Y,S}$ can be regarded as a
retrochannel. Note that the retrochannel samples from the
Bayesian posterior distribution (in general, the postulated one)
in such a way that, conditioned on the observation, the input to
the channel and the output of the retrochannel are independent.
It is clear that the PME output $\langle X \rangle_q$ is the expected value of
the output of the retrochannel $q_{X|Y,S}$ given $(Y, S)$.

In this paper, the posterior $q_{X|Y,S}$ supplied to the PME is
assumed to be the one that corresponds to a postulated CDMA
system, where the input distribution is an arbitrary $q_X$, and the
input-output relationship of the postulated channel differs from
the actual channel (4) by only the noise variance. Precisely,
the postulated channel is characterized by

$$ Y = S X' + \sigma N' \tag{8} $$

where the channel state matrix S is identical to that of the
actual channel (4), and $N'$ is statistically the same as the
Gaussian noise N in (4). The postulated input distribution
$q_X$ is assumed to have zero mean and finite moments, and
$q_{X|Y,S}$ is determined by $q_X$ and $q_{Y|X,S}$ according to the
Bayes formula. Here, $\sigma$ serves as a control parameter. Indeed,
the PME so defined is the optimal detector for a postulated
multiuser system with its input distribution and noise level
different from the actual ones. In general, the assumed information
about the channel state S could also be different from
the actual instances, but this is out of the scope of this work,
as we limit ourselves to studying the (rich) family of multiuser
detectors that can be represented as the PME parameterized
by the postulated input and noise level $(q_X, \sigma)$.

We note that PME under postulated posterior is known in 
the Bayes statistics literature. This technique was introduced 
to multiuser detection by Tanaka in the special case of equal- 
power users with binary or Gaussian inputs under the name of 
marginal-posterior-mode detectors [20], [23]. In this paper we 
pursue further that direction to treat arbitrary input, arbitrary 
power distribution, and generic multiuser detection. 


C. Specific Detectors 

The rest of this section assumes the system model (4) to
be real-valued. The inputs $X_k$, the spreading chips $S_{nk}$, and
all entries of N take real values and have unit variance. The
characteristic of the actual channel is

$$ p_{Y|X,S}(y \mid x, S) = (2\pi)^{-\frac{L}{2}} \exp\left[ -\frac{1}{2} \| y - S x \|^2 \right] \tag{9} $$

and that of the postulated channel is

$$ q_{Y|X,S}(y \mid x, S) = (2\pi\sigma^2)^{-\frac{L}{2}} \exp\left[ -\frac{1}{2\sigma^2} \| y - S x \|^2 \right]. \tag{10} $$


We identify specific choices of the postulated input distribution
$q_X$ and noise level $\sigma$ under which the PME is
particularized to well-known multiuser detectors.

1) Linear Detectors: Let the postulated input be standard
Gaussian, $q_X \sim \mathcal{N}(0, 1)$. The optimal detector (PME) for the
postulated model (8) with standard Gaussian inputs is a linear
filtering of the received signal Y:

$$ \langle X \rangle_q = \left[ S^\top S + \sigma^2 I \right]^{-1} S^\top Y. \tag{11} $$


The control parameter $\sigma$ can be tuned to choose from the
single-user matched filter, decorrelator, MMSE detector, etc.
If $\sigma \to \infty$, the PME estimate (11) is consistent with the single-
user matched filter output:

$$ \sigma^2 \langle X_k \rangle_q \to \sqrt{\mathsf{snr}_k}\, s_k^\top Y \quad \text{as } \sigma \to \infty. \tag{12} $$

If $\sigma = 1$, (11) is exactly the soft output of the linear MMSE
detector. If $\sigma \to 0$, (11) converges to the decorrelator output.
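
The three regimes of the control parameter can be checked numerically. The Python sketch below implements the linear filter $[S^\top S + \sigma^2 I]^{-1} S^\top Y$ for the real-valued model and is meant only as an illustration; the function name is hypothetical, and the decorrelator comparison assumes that S has full column rank (which requires at least as many chips as users).

```python
import numpy as np


def linear_pme(S, y, sigma):
    """Return [S^T S + sigma^2 I]^{-1} S^T y for a real-valued L x K matrix S.

    The columns of S are assumed to already include the factors sqrt(snr_k),
    as in the channel state matrix of the system model.
    """
    K = S.shape[1]
    gram = S.T @ S + (sigma ** 2) * np.eye(K)
    # Solve the linear system instead of forming the matrix inverse.
    return np.linalg.solve(gram, S.T @ y)
```

For a very large sigma, `sigma**2 * linear_pme(S, y, sigma)` approaches the matched filter output `S.T @ y`; for sigma = 1 the function returns the linear MMSE estimate; and for a very small sigma it approaches the decorrelator output `np.linalg.pinv(S) @ y`.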

2) Optimal Detectors: Let the postulated $q_X$ be identical
to the true one, $p_X$. The posterior is then

$$ q_{X|Y,S}(x \mid y, S) = \frac{p_X(x)}{Z(y, S)} \exp\left[ -\frac{1}{2\sigma^2} \| y - S x \|^2 \right] \tag{13} $$

where $Z(y, S)$ is a normalization factor.

Suppose that the postulated noise level $\sigma \to 0$; then the
probability mass of the distribution $q_{X|Y,S}$ is concentrated on
a vector that minimizes $\| y - S x \|$, which also maximizes the
likelihood function $p_{Y|X,S}(y \mid x, S)$. The PME $\lim_{\sigma \to 0} \langle X \rangle_q$
is thus equivalent to that of jointly optimal (or maximum-
likelihood) detection [1].

Alternatively, if $\sigma = 1$, then the postulated measure coincides
with the actual measure, i.e., $q = p$. The PME output
$\langle X \rangle$ is the mean of the marginal of the conditional posterior
probability distribution. It is the nonlinear MMSE detector
for the actual system, and is seen as a soft version of the
individually optimal detector [1].
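
For a small system with binary inputs, the role of the postulated noise level can be illustrated by brute-force enumeration. The Python sketch below evaluates the posterior mean under a posterior proportional to $p_X(x) \exp[-\| y - S x \|^2 / (2\sigma^2)]$ with equiprobable ±1 inputs, and also returns the maximum-likelihood vector for comparison. The function name and the enumeration approach are illustrative assumptions; the cost grows as 2^K, so this is practical only for a handful of users.

```python
import itertools

import numpy as np


def generalized_pme_binary(S, y, sigma):
    """Posterior mean under the postulated noise level sigma, binary inputs.

    Returns (pme, ml): the posterior mean vector and the candidate in
    {-1, +1}^K that minimizes ||y - S x||. Assumes equiprobable +/-1 inputs.
    """
    K = S.shape[1]
    cands = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))
    d2 = np.sum((y[None, :] - cands @ S.T) ** 2, axis=1)
    # Subtract the minimum distance before exponentiating, for stability.
    w = np.exp(-(d2 - d2.min()) / (2.0 * sigma ** 2))
    w /= w.sum()
    return w @ cands, cands[np.argmin(d2)]
```

With a very small sigma the returned posterior mean is numerically indistinguishable from the maximum-likelihood vector, while sigma = 1 yields soft values strictly between the two input levels, in line with the two cases discussed above.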

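Also worth mentioning here is that, if $\sigma \to \infty$, the PME
reduces to the single-user matched filter. Indeed, (12) can be
shown to hold by noticing from (13) that

$$ q_{X|Y,S}(x \mid y, S) = \frac{p_X(x)}{Z(y, S)} \left[ 1 - \frac{\| y - S x \|^2}{2\sigma^2} + O(\sigma^{-4}) \right]. \tag{14} $$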

D. Main Results 

This subsection gives the main results of this paper assuming
the real-valued system model. The detailed replica analysis
for obtaining these results is relegated to Sections III and IV.
Results for a complex-valued model are given in Section V.

Fig. 5. (a) The multiuser channel, the (multiuser) PME, and the companion
(multiuser) retrochannel. (b) The equivalent single-user Gaussian channel,
PME and retrochannel.


Consider the multiuser channel $p_{Y|X,S}$ given by (9) with
input $X \sim p_X$, and the posterior mean estimator (7) parameterized
by $(q_X, \sigma)$. Section II-C illustrated the versatility of the
PME encompassing many well-known detectors. The goal here
is to quantify the optimal spectral efficiency $\frac{1}{L} I(X; Y \mid S)$, the
quality of the detection output $\langle X_k \rangle_q$ for each user k, as well
as the input-output mutual information $I(X_k; \langle X_k \rangle_q \mid S)$.

Although these performance measures are all dependent on
the realization of the channel state, such dependence vanishes
in the large-system asymptote. A large system here refers to
the limit in which both the number of users and the spreading factor
tend to infinity but with their ratio, known as the system load,
converging to a positive number, i.e., $K/L \to \beta$, which may or
may not be smaller than 1. It is also assumed that the SNRs of
all users, $\{\mathsf{snr}_k\}_{k=1}^{K}$, are i.i.d. with distribution $P_{\mathsf{snr}}$, hereafter
referred to as the SNR distribution. All moments of the SNR
distribution are assumed to be finite. Clearly, the empirical
distributions of the SNRs converge to the same distribution
$P_{\mathsf{snr}}$ as $K \to \infty$. Note that this SNR distribution captures the
(flat) fading characteristics of the channel.

Given $(\beta, P_{\mathsf{snr}}, p_X, q_X, \sigma)$, we express the
large-system limit of the multiuser efficiency and spectral
efficiency under both separate and joint decoding.

1) The Decoupling Principle: The multiuser channel
$p_{Y|X,S}$ and the multiuser posterior mean estimator parameterized
by $(q_X, \sigma)$ are depicted in Figure 5(a), together with the
companion (multiuser) retrochannel $q_{X|Y,S}$. Here the input to
the multiuser channel is denoted by $X_0$ to distinguish it from
the output $X$ of the retrochannel. For an arbitrary user $k$, the
SNR is $\mathsf{snr}_k$, and $X_{0k}$, $X_k$ and $\langle X_k \rangle_q$
denote the input symbol, the retrochannel output and the PME
output, all for user $k$.

In order to show the decoupling result, let us also consider
the composition of a Gaussian channel, a PME and a companion
retrochannel in the single-user setting as depicted in
Figure 5(b). The input and output are related by:

$$Z = \sqrt{\mathsf{snr}}\, X_0 + \frac{1}{\sqrt{\eta}}\, N \tag{15}$$

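where the input $X_0 \sim p_X$, $\mathsf{snr}$ is the input SNR,
$N \sim \mathcal{N}(0,1)$ the noise independent of $X_0$, and $\eta > 0$
the inverse noise variance. The conditional distribution associated
with the channel is

$$p_{Z|X,\mathsf{snr};\eta}(z|x,\mathsf{snr};\eta) = \sqrt{\frac{\eta}{2\pi}} \exp\left[-\frac{\eta}{2}\left(z - \sqrt{\mathsf{snr}}\, x\right)^2\right]. \tag{16}$$

Let $q_{Z|X,\mathsf{snr};\xi}$ represent a Gaussian channel akin to (15),
the only difference being that the inverse noise variance is $\xi$
instead of $\eta$:

$$q_{Z|X,\mathsf{snr};\xi}(z|x,\mathsf{snr};\xi) = \sqrt{\frac{\xi}{2\pi}} \exp\left[-\frac{\xi}{2}\left(z - \sqrt{\mathsf{snr}}\, x\right)^2\right]. \tag{17}$$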

Similar to that in the multiuser setting, by postulating the
input distribution to be $q_X$, a posterior probability distribution
$q_{X|Z,\mathsf{snr};\xi}$ is induced by $q_X$ and $q_{Z|X,\mathsf{snr};\xi}$
using the Bayes rule. Thus we have a single-user retrochannel
defined by $q_{X|Z,\mathsf{snr};\xi}$, which outputs a random variable
$X$ given the channel output $Z$ (Figure 5(b)). A (generalized)
single-user PME is defined naturally, in analogy with the multiuser
PME, as:

$$\langle X \rangle_q = \mathsf{E}_q\{X \mid Z, \mathsf{snr}; \xi\}. \tag{18}$$


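The probability law of the composite system depicted by
Figure 5(b) is determined by $\mathsf{snr}$ and two parameters $\eta$
and $\xi$. We define the mean-square error of the PME as

$$\mathcal{E}(\mathsf{snr}; \eta, \xi) = \mathsf{E}\left\{\left(X_0 - \langle X \rangle_q\right)^2 \,\middle|\, \mathsf{snr}; \eta, \xi\right\} \tag{19}$$

and also define the variance of the retrochannel as

$$\mathcal{V}(\mathsf{snr}; \eta, \xi) = \mathsf{E}\left\{\left(X - \langle X \rangle_q\right)^2 \,\middle|\, \mathsf{snr}; \eta, \xi\right\}. \tag{20}$$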


The following is claimed.¹

Claim 1: Consider the multiuser channel (3) with input
distribution $p_X$ and SNR distribution $P_{\mathsf{snr}}$. Let its
output be fed into the posterior mean estimator and a retrochannel
$q_{X|Y,S}$, both parameterized by the postulated input $q_X$ and
noise level $\sigma$ (refer to Figure 5(a)). Fix
$(\beta, P_{\mathsf{snr}}, p_X, q_X, \sigma)$. Let $X_{0k}$, $X_k$, and
$\langle X_k \rangle_q$ be the input, the retrochannel output
and the posterior mean estimate for user $k$ with input
signal-to-noise ratio $\mathsf{snr}_k$. Then,

(a) The joint distribution of $(X_{0k}, X_k, \langle X_k \rangle_q)$
conditioned on the channel state $S$ converges in probability as
$K \to \infty$ and $K/L \to \beta$ to the joint distribution of
$(X_0, X, \langle X \rangle_q)$, where $X_0 \sim p_X$ is the input to the
single-user Gaussian channel (16) with inverse noise variance $\eta$,
$X$ is the output of the single-user retrochannel parameterized by
$(q_X, \xi)$, and $\langle X \rangle_q$ is the corresponding posterior
mean estimate (18), with $\mathsf{snr} = \mathsf{snr}_k$
(refer to Figure 5(b)).

¹ Since, as explained in Section I, rigorous justification for some of
the key statistical physics tools (essentially the replica method) is
still pending, the key results in this paper are referred to as claims.
Proofs are provided in Section IV based on those statistical physics
tools.

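(b) The parameter $\eta$, known as the multiuser efficiency,
satisfies together with $\xi$ the coupled equations:

$$\eta^{-1} = 1 + \beta\, \mathsf{E}\{\mathsf{snr} \cdot \mathcal{E}(\mathsf{snr}; \eta, \xi)\}, \tag{21a}$$

$$\xi^{-1} = \sigma^2 + \beta\, \mathsf{E}\{\mathsf{snr} \cdot \mathcal{V}(\mathsf{snr}; \eta, \xi)\}, \tag{21b}$$

where the expectations are taken over $P_{\mathsf{snr}}$. In case of
multiple solutions to (21), $(\eta, \xi)$ is chosen to minimize the
free energy expressed as²,³

$$\mathcal{F} = -\mathsf{E}\left\{\int p_{Z|\mathsf{snr};\eta}(z|\mathsf{snr};\eta) \log q_{Z|\mathsf{snr};\xi}(z|\mathsf{snr};\xi)\, dz\right\} + \frac{1}{2\beta}\left[(\xi - 1)\log e - \log \xi\right] - \frac{1}{2}\log\frac{2\pi}{\xi} - \frac{\xi}{2\eta}\log e + \cdots \tag{22}$$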

Claim 1 reveals that, from an individual user's viewpoint,
the input-output relationship of the multiuser channel, PME
and companion retrochannel is increasingly similar to that under
a simple single-user setting as the system becomes large. In
other words, given the three (scalar) input and output statistics,
it is not possible to distinguish whether the underlying system
is in the (large) multiuser or the single-user setting as depicted
in Figures 5(a) and 5(b), respectively. It is also interesting to
note that the (asymptotically) equivalent single-user system
has a structure analogous to that of the multiuser one.

Obtained using the replica method, the coupled equations
(21) may have multiple solutions. This is known as phase
coexistence in statistical physics. Among those solutions, the
thermodynamically dominant solution is the one that gives the
smallest value of the free energy (22). This is the solution
that carries relevant operational meaning in the communication
problem. In general, as the system parameters (such as the
load) change, the dominant solution may switch from one
of the coexisting solutions to another. This phenomenon is
known as phase transition (refer to the numerical examples
later in the paper).

The single-user PME (18) is merely a decision function applied
to the Gaussian channel output, which can be expressed
explicitly as

$$\mathsf{E}_q\{X \mid Z, \mathsf{snr}; \xi\} = \frac{q_1(Z, \mathsf{snr}; \xi)}{q_0(Z, \mathsf{snr}; \xi)}, \tag{23}$$

where we define the following useful functions for all nonnegative
integers $i = 0, 1, \ldots$:

$$q_i(z, \mathsf{snr}; \xi) = \mathsf{E}_q\left\{X^i\, q_{Z|X,\mathsf{snr};\xi}(z \mid X, \mathsf{snr}; \xi) \,\middle|\, \mathsf{snr}\right\}, \tag{24}$$

where the expectation is taken over $q_X$. Note that
$q_0(z, \mathsf{snr}; \xi) = q_{Z|\mathsf{snr};\xi}(z|\mathsf{snr};\xi)$.
The decision function (23) is in general nonlinear. Due to Claim 1,
although the multiuser PME output $\langle X_k \rangle_q$ is in general
non-Gaussian, it is in fact asymptotically a function (the decision
function (23)) of a conditionally Gaussian random variable $Z$
centered at the actual input $X_k$ scaled by $\sqrt{\mathsf{snr}_k}$
with a variance of $\eta^{-1}$.

² The base of the logarithm is consistent with the unit of information
measure in this paper unless stated otherwise.

³ The integral with respect to $z$ is from $-\infty$ to $\infty$. For
notational simplicity we omit integral limits in this paper whenever
they are clear from context.

Corollary 1: In the large-system limit, the channel between
the input $X_{0k}$ and the multiuser posterior mean estimate
$\langle X_k \rangle_q$ for user $k$ is equivalent to the Gaussian
channel $p_{Z|X,\mathsf{snr};\eta}$ concatenated with the one-to-one
decision function (23) with $\mathsf{snr} = \mathsf{snr}_k$, where
$\eta$ is the multiuser efficiency determined by Claim 1.

As shown in Section IV-B, for fixed $\mathsf{snr}$ and $\xi$, the
decision function (23) is strictly monotone increasing in $Z$.
Therefore, in the large-system limit, given the detection output
$\langle X_k \rangle_q$, one can apply the inverse of the decision
function to recover an equivalent conditionally Gaussian statistic
$Z$. Note that $\eta \in [0,1]$ from (21a). It is clear that, in the
large-system limit, the multiple-access interference is consolidated
into an enhancement of the thermal noise by $\eta^{-1}$, i.e., the
effective SNR is reduced by a factor of $\eta$, hence the term
multiuser efficiency. Equal for all users, the multiuser efficiency
solves the coupled fixed-point equations (21). Indeed, in the
large-system limit, the multiuser channel with the PME front end can
be decoupled into a bank of independent single-user Gaussian
channels with the same degradation in each user's SNR. This
is referred to as the decoupling principle.

Since the decision function is one-to-one, it is inconsequential
from both the detection and the information-theoretic
viewpoints. Hence the following result:

Corollary 2: In the large-system limit, the mutual information
between the input symbol and the output of the multiuser
posterior mean estimator for a particular user is equal to
the input-output mutual information of the equivalent single-user
Gaussian channel with the same input distribution and
SNR, and an inverse noise variance $\eta$ equal to the multiuser
efficiency given by Claim 2.

According to Corollary 2, the mutual information
$I(X_k; \langle X_k \rangle_q \mid S)$ for a user with signal-to-noise
ratio $\mathsf{snr}_k = \mathsf{snr}$ converges to a function of the
effective SNR defined as

$$I(\eta\, \mathsf{snr}) = D\left(p_{Z|X,\mathsf{snr};\eta} \,\|\, p_{Z|\mathsf{snr};\eta} \,\middle|\, p_X\right), \tag{25}$$

where $D(\cdot \| \cdot | \cdot)$ stands for conditional
(Kullback-Leibler) divergence, and $p_{Z|\mathsf{snr};\eta}$ is the
marginal distribution of the output of the channel (16). The overall
spectral efficiency under separate decoding is the sum of the
single-user mutual informations divided by the dimension of the
multiuser channel (spreading factor $L$), which is simply

$$C_{\mathrm{sep}}(\beta) = \beta\, \mathsf{E}\{I(\eta\, \mathsf{snr})\}, \tag{26}$$

where the expectation is over $P_{\mathsf{snr}}$.

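In general, it is straightforward to determine the multiuser
efficiency $\eta$ (and the inverse noise variance $\xi$) by solving
the joint equations (21). Define the following functions akin
to (24):

$$p_i(z, \mathsf{snr}; \eta) = \mathsf{E}\left\{X^i\, p_{Z|X,\mathsf{snr};\eta}(z \mid X, \mathsf{snr}; \eta) \,\middle|\, \mathsf{snr}\right\}. \tag{27}$$

Some algebra leads to

$$\mathcal{E}(\mathsf{snr}; \eta, \xi) = 1 + \int \left[ p_0(z, \mathsf{snr}; \eta)\, \frac{q_1^2(z, \mathsf{snr}; \xi)}{q_0^2(z, \mathsf{snr}; \xi)} - 2\, p_1(z, \mathsf{snr}; \eta)\, \frac{q_1(z, \mathsf{snr}; \xi)}{q_0(z, \mathsf{snr}; \xi)} \right] dz \tag{28}$$

and

$$\mathcal{V}(\mathsf{snr}; \eta, \xi) = \int \left[ p_0(z, \mathsf{snr}; \eta)\, \frac{q_2(z, \mathsf{snr}; \xi)}{q_0(z, \mathsf{snr}; \xi)} - p_0(z, \mathsf{snr}; \eta)\, \frac{q_1^2(z, \mathsf{snr}; \xi)}{q_0^2(z, \mathsf{snr}; \xi)} \right] dz. \tag{29}$$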

Numerical integration can be applied to evaluate (28) and
(29) in general. It is then viable to find solutions to the
joint equations (21) numerically. In case of multiple sets of
solutions, the ambiguity is resolved by choosing the one that
minimizes the free energy (22). Note that the mean-square
error and variance often admit simpler expressions than (28)
and (29) under certain practical inputs, which may ease the
computation significantly (see examples in Section II-E).
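As a rough illustration of the numerical route just described, the following Python sketch evaluates (28) and (29) on a uniform grid and then iterates the coupled equations (21). It is not taken from the paper: the function names, the restriction to discrete priors and equal-power users, the grid parameters, and the plain fixed-point iteration started at $\eta = \xi = 1$ are all assumptions made for the example. The actual input is assumed to have unit variance, as (28) requires. The iteration is not guaranteed to converge, and when (21) has several solutions this sketch does not select among them by the free energy (22).

```python
import numpy as np

def gaussian_mix(z, snr, inv_var, xs, ps, order):
    """p_i or q_i of (24)/(27) for a discrete prior (support xs, probabilities ps)."""
    d = z[:, None] - np.sqrt(snr) * xs[None, :]
    dens = np.sqrt(inv_var / (2.0 * np.pi)) * np.exp(-0.5 * inv_var * d ** 2)
    return (dens * ps * xs ** order).sum(axis=1)

def posterior_moments(z, snr, xi, xs, qs):
    """q_1/q_0 and q_2/q_0, computed in the log domain to avoid 0/0 in the tails."""
    d = z[:, None] - np.sqrt(snr) * xs[None, :]
    logw = np.log(qs) - 0.5 * xi * d ** 2
    logw -= logw.max(axis=1, keepdims=True)
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    return w @ xs, w @ xs ** 2

def mse_and_variance(snr, eta, xi, p_prior, q_prior, n=20001, width=10.0):
    """Riemann-sum evaluation of (28) and (29); unit-variance actual input assumed."""
    (xs_p, ps), (xs_q, qs) = p_prior, q_prior
    # The integrands are weighted by p_0 or p_1, so the grid follows the actual channel.
    half = np.sqrt(snr) * np.abs(xs_p).max() + width / np.sqrt(eta)
    z = np.linspace(-half, half, n)
    dz = z[1] - z[0]
    p0 = gaussian_mix(z, snr, eta, xs_p, ps, 0)
    p1 = gaussian_mix(z, snr, eta, xs_p, ps, 1)
    mean, second = posterior_moments(z, snr, xi, xs_q, qs)
    mse = 1.0 + np.sum(p0 * mean ** 2 - 2.0 * p1 * mean) * dz
    var = np.sum(p0 * (second - mean ** 2)) * dz
    return mse, var

def solve_coupled(beta, snr, sigma, p_prior, q_prior, iters=200):
    """Fixed-point iteration on (21a)-(21b) for equal-power users (illustrative only)."""
    eta, xi = 1.0, 1.0
    for _ in range(iters):
        mse, var = mse_and_variance(snr, eta, xi, p_prior, q_prior)
        eta = 1.0 / (1.0 + beta * snr * mse)
        xi = 1.0 / (sigma ** 2 + beta * snr * var)
    return eta, xi
```

For instance, with `bpsk = (np.array([-1.0, 1.0]), np.array([0.5, 0.5]))` used as both the actual and the postulated prior and `sigma = 1.0`, the two returned parameters should coincide, in line with the matched case $q = p$.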

2) Optimal Detection and Spectral Efficiency: Among all 
multiuser detection schemes, the individually optimal detector 
has particular importance. As we shall see, the optimal spectral 
efficiency achievable by joint decoding is also tightly related 
to the multiuser efficiency of optimal detection. 

As shown in Section II-C, the soft individually optimal
detector can be regarded as a PME with a postulated measure
that is exactly the same as the actual measure, i.e., $q = p$.
Consider the channel, PME and retrochannel in the multiuser
setting as depicted in Figure 5(a). It is clear that in case
of optimal detection, the input $X_0$ to the multiuser channel
and the retrochannel output $X$ are i.i.d. given $(Y, S)$. The
decoupling principle stated in Claim 1 can be particularized
to the case of $q = p$. The multiuser efficiency and the
postulated inverse noise variance then satisfy the joint equations:

$$\eta^{-1} = 1 + \beta\, \mathsf{E}\{\mathsf{snr} \cdot \mathcal{E}(\mathsf{snr}; \eta, \xi)\}, \tag{30a}$$

$$\xi^{-1} = 1 + \beta\, \mathsf{E}\{\mathsf{snr} \cdot \mathcal{V}(\mathsf{snr}; \eta, \xi)\}. \tag{30b}$$

Due to the replica symmetry assumption, and noting that
$\mathcal{E}(\mathsf{snr}; x, x) = \mathcal{V}(\mathsf{snr}; x, x)$ for all
$x$, we take the solution $\eta = \xi$. It should be cautioned that (30)
may have other solutions with $\eta \neq \xi$ in the unlikely case that
replica symmetry does not hold for optimal detection.

In the equivalent single-user setting (Figure 5(b)), the above
arguments imply that the postulated channel is also identical
to the actual channel, and $X$ and $X_0$ are i.i.d. given $Z$. The
posterior mean estimate of $X$ given the output $Z$ is

$$\langle X \rangle = \mathsf{E}\{X \mid Z, \mathsf{snr}; \eta\}. \tag{31}$$

Clearly, $\langle X \rangle$ is also the (nonlinear) MMSE estimate,
since it achieves the minimum mean-square error:

$$\mathsf{mmse}(\eta\, \mathsf{snr}) = \mathsf{E}\left\{\left(X - \langle X \rangle\right)^2 \,\middle|\, \mathsf{snr}; \eta\right\}. \tag{32}$$

Indeed,

$$\mathcal{E}(\mathsf{snr}; x, x) = \mathcal{V}(\mathsf{snr}; x, x) = \mathsf{mmse}(x\, \mathsf{snr}), \quad \forall x. \tag{33}$$

The following is a special case of Corollary 2 for the
individually optimal detector.

Claim 2: In the large-system limit, the distribution of the
output $\langle X_k \rangle$ of the individually optimal detector for
the multiuser channel (3) conditioned on $X_k = x$ being transmitted
with signal-to-noise ratio $\mathsf{snr}_k$ is identical to the
distribution of the posterior mean estimate $\langle X \rangle$ of the
single-user Gaussian channel (16) conditioned on $X_0 = x$ being
transmitted with $\mathsf{snr} = \mathsf{snr}_k$, where the optimal
multiuser efficiency $\eta$ satisfies a fixed-point equation:

$$\eta^{-1} = 1 + \beta\, \mathsf{E}\{\mathsf{snr} \cdot \mathsf{mmse}(\eta\, \mathsf{snr})\}. \tag{34}$$

The single-user PME (31) is a (nonlinear) decision function
that admits an expression as (23) with $q$ replaced by $p$. The
MMSE can be computed as

$$\mathsf{mmse}(\eta\, \mathsf{snr}) = 1 - \int \frac{p_1^2(z, \mathsf{snr}; \eta)}{p_0(z, \mathsf{snr}; \eta)}\, dz. \tag{35}$$


Solutions to the fixed-point equation (34) can in general be
found numerically. There are cases in which (34) has more
than one solution. The ambiguity is resolved by taking the one
that minimizes the free energy (22) with $\xi = \eta$, or equivalently,
as we shall see next, the optimal spectral efficiency.

The single-user mutual information is given by (25) due to
Corollary 2, where the multiuser efficiency is now given by
Claim 2. The optimal spectral efficiency under joint decoding
is greater than that under separate decoding (26), where the
increase is given by the following:

Claim 3: The spectral efficiency gain of optimal joint decoding
over individually optimal detection followed by separate
decoding of the multiuser channel (3) is determined, in
the large-system limit, by the optimal multiuser efficiency as

$$C_{\mathrm{joint}}(\beta) - C_{\mathrm{sep}}(\beta) = \frac{1}{2}\left[(\eta - 1)\log e - \log \eta\right] \tag{36}$$

$$= D\left(\mathcal{N}(0, \eta) \,\|\, \mathcal{N}(0, 1)\right). \tag{37}$$

In other words, the spectral efficiency under joint decoding is

$$C_{\mathrm{joint}}(\beta) = \beta\, \mathsf{E}\{I(\eta\, \mathsf{snr})\} + \frac{1}{2}\left[(\eta - 1)\log e - \log \eta\right]. \tag{38}$$

In case of multiple solutions to (34), the optimal multiuser
efficiency $\eta$ is the one that gives the smallest $C_{\mathrm{joint}}$.

Indeed, Müller's conjecture on the mutual information loss
[25] is true for arbitrary inputs and SNRs. Incidentally, the
loss is identified as a divergence between two Gaussian
distributions in (37).
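The identification of the gain (36) with the Gaussian divergence (37) is easy to confirm numerically. The sketch below is only an illustration, not part of the paper: it works in nats (so that $\log e = 1$), reads $\mathcal{N}(0, \eta)$ as a Gaussian with variance $\eta$, and compares the closed form with a direct Riemann-sum evaluation of the Kullback-Leibler integral. The function names and grid parameters are hypothetical.

```python
import numpy as np

def gain_nats(eta):
    """Right-hand side of (36) with natural logarithms (log e = 1)."""
    return 0.5 * ((eta - 1.0) - np.log(eta))

def kl_gaussian_numeric(var_p, var_q, half_width=40.0, n=400001):
    """D(N(0, var_p) || N(0, var_q)) in nats via a Riemann sum of p * log(p / q)."""
    z = np.linspace(-half_width, half_width, n)
    dz = z[1] - z[0]
    log_p = -0.5 * np.log(2.0 * np.pi * var_p) - z ** 2 / (2.0 * var_p)
    log_q = -0.5 * np.log(2.0 * np.pi * var_q) - z ** 2 / (2.0 * var_q)
    return np.sum(np.exp(log_p) * (log_p - log_q)) * dz

if __name__ == "__main__":
    for eta in (0.2, 0.5, 0.9):
        print(eta, gain_nats(eta), kl_gaussian_numeric(eta, 1.0))
```

The gain vanishes at $\eta = 1$ and grows as the multiuser efficiency drops, which matches the intuition that joint decoding helps most when the interference penalty is large.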

Equal-power Gaussian input is the first known case that
admits a closed-form solution for the multiuser efficiency [1,
p. 305] and thus also the spectral efficiencies. The spectral
efficiencies under joint and separate decoding were found for
Gaussian inputs with fading in [11], and then found implicitly
in [23] and later explicitly in [24] for equal-power users
with binary inputs. Formula (38) is the first general result for
arbitrary input distributions and received powers.

Interestingly, the spectral efficiencies under joint and separate
decoding are also related by an integral equation, given
in [11, (160)] for the special case of Gaussian inputs.

Theorem 1: Regardless of the input and power distributions,

$$C_{\mathrm{joint}}(\beta) = \int_0^{\beta} \frac{1}{\beta'}\, C_{\mathrm{sep}}(\beta')\, d\beta'. \tag{39}$$

Proof: Since $C_{\mathrm{joint}}(0) = 0$ trivially, it suffices to show

$$\beta\, \frac{d}{d\beta}\, C_{\mathrm{joint}}(\beta) = C_{\mathrm{sep}}(\beta). \tag{40}$$

By (26) and (38), it is enough to show

$$\beta\, \frac{d}{d\beta}\, \mathsf{E}\{I(\eta\, \mathsf{snr})\} + \frac{1}{2}\, \frac{d}{d\beta}\left[(\eta - 1)\log e - \log \eta\right] = 0. \tag{41}$$

Noticing that the multiuser efficiency $\eta$ is a function of the
system load $\beta$, (41) is equivalent to

$$\frac{d}{d\eta}\, \mathsf{E}\{I(\eta\, \mathsf{snr})\} + \frac{1}{2\beta}\left(1 - \eta^{-1}\right)\log e = 0. \tag{42}$$

By a recent formula that links the mutual information and
MMSE in Gaussian channels [41],⁴

$$\frac{1}{\log e}\, \frac{d}{d\eta}\, I(\eta\, \mathsf{snr}) = \frac{\mathsf{snr}}{2}\, \mathsf{mmse}(\eta\, \mathsf{snr}). \tag{43}$$

Thus (42) holds as $\eta$ satisfies the fixed-point equation (34). ■

⁴ In fact, the proof of Theorem 1 led us to the discovery of the general
I-MMSE relationship in [41].

Theorem 1 is an outcome of the chain rule of mutual
information, which holds for all inputs and an arbitrary number
of users:

$$I(X; Y \mid S) = \sum_{k=1}^{K} I(X_k; Y \mid S, X_{k+1}, \ldots, X_K). \tag{44}$$

The left-hand side of (44) is the total mutual information of
the multiuser channel. Each mutual information on the right-hand
side of (44) is a single-user mutual information over the
multiuser channel conditioned on the symbols of previously
decoded users. As argued below, the limit of (44) as $K \to \infty$
becomes the integral equation (39).

Consider an interference canceler with PME front ends
against yet undecoded users that decodes the users successively,
in which reliably decoded symbols are used to reconstruct
the interference for cancellation. Since the error
probability of intermediate decisions vanishes with code
block-length, the interference from decoded users is asymptotically
completely removed. Assume without loss of generality that
the users are decoded in reverse order; then the PME for user
$k$ sees only $k - 1$ interfering users. Hence the performance
for user $k$ under such successive decoding is identical to
that under multiuser detection with separate decoding in a
system with $k$ instead of $K$ users. Nonetheless, the equivalent
single-user channel for each user is Gaussian by Corollary 1.
The multiuser efficiency experienced by user $k$, $\eta(k/L)$, is a
function of the load $k/L$ seen by the PME for user $k$. By
Corollary 2, the single-user mutual information for user $k$ is
therefore

$$I\left(\eta(k/L)\, \mathsf{snr}_k\right). \tag{45}$$

Since the $\mathsf{snr}_k$ are i.i.d., the overall spectral efficiency
under successive decoding converges almost surely:

$$\frac{1}{L} \sum_{k=1}^{K} I\left(\eta(k/L)\, \mathsf{snr}_k\right) \to \mathsf{E}\left\{\int_0^{\beta} I\left(\eta(\beta')\, \mathsf{snr}\right) d\beta'\right\}. \tag{46}$$

Note that the above result on successive decoding is true
for arbitrary input distribution and arbitrary PME detectors. In
the special case of individually optimal detection, for which
the postulated system is identical to the actual one, the
right-hand side of (46) is equal to $C_{\mathrm{joint}}(\beta)$ by
Theorem 1. We can summarize this principle as:

Claim 4: In the large-system limit, successive decoding
with an individually optimal detection front end against yet
undecoded users achieves the optimal CDMA channel capacity
under arbitrary constraint on the input.

Claim 4 is a generalization of the result that a successive
canceler with a linear MMSE front end against undecoded
users achieves the capacity of the CDMA channel under
Gaussian inputs.⁵
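The integral relation (39), and with it the successive-decoding statement above, can be checked numerically in the one case where everything is in closed form: equal-power users with Gaussian inputs, for which $\mathsf{mmse}(\gamma) = 1/(1+\gamma)$ and $I(\gamma) = \frac{1}{2}\log(1+\gamma)$. The following sketch is an illustration under those assumptions, in nats; the function names and the midpoint quadrature are choices made for the example and not the paper's procedure. With these inputs, (34) reduces to a quadratic in $\eta$ whose positive root is used directly.

```python
import numpy as np

def eta_lmmse(beta, snr):
    """Positive root of snr*eta^2 + (1 + beta*snr - snr)*eta - 1 = 0,
    i.e. (34) with mmse(g) = 1/(1+g) and equal-power users."""
    b = 1.0 + beta * snr - snr
    return (-b + np.sqrt(b * b + 4.0 * snr)) / (2.0 * snr)

def c_sep(beta, snr):
    """(26) for Gaussian inputs, in nats."""
    return 0.5 * beta * np.log1p(eta_lmmse(beta, snr) * snr)

def c_joint(beta, snr):
    """(38) for Gaussian inputs, in nats."""
    eta = eta_lmmse(beta, snr)
    return c_sep(beta, snr) + 0.5 * ((eta - 1.0) - np.log(eta))

def integral_of_c_sep(beta, snr, n=200001):
    """Midpoint-rule value of the right-hand side of (39)."""
    edges = np.linspace(0.0, beta, n)
    mid = 0.5 * (edges[:-1] + edges[1:])
    return np.sum(c_sep(mid, snr) / mid) * (edges[1] - edges[0])

if __name__ == "__main__":
    for beta, snr in ((0.5, 2.0), (1.0, 1.0), (2.0, 10.0)):
        print(beta, snr, c_joint(beta, snr), integral_of_c_sep(beta, snr))
```

The midpoint rule is used so that the integrand is never evaluated at $\beta' = 0$, where $C_{\mathrm{sep}}(\beta')/\beta'$ has a finite limit but the expression is formally 0/0.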


E. Recovering Known Results 

As shown in Section II-C, several well-known multiuser detectors
can be regarded as appropriately parameterized PMEs. Thus
many previously known results can be recovered as special
cases of the new findings in Section II-D.

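1) Linear Detectors: Let the postulated prior $q_X$ be standard
Gaussian so that the PME represents a linear multiuser
detector. Since the input $Z$ and output $X$ of the retrochannel
are jointly Gaussian (refer to Figure 5(b)), the single-user PME
is simply a linear attenuator:

$$\langle X \rangle_q = \frac{\xi \sqrt{\mathsf{snr}}}{1 + \xi\, \mathsf{snr}}\, Z. \tag{47}$$

From (19), the mean-square error is

$$\mathcal{E}(\mathsf{snr}; \eta, \xi) = \mathsf{E}\left\{\left[X_0 - \frac{\xi \sqrt{\mathsf{snr}}}{1 + \xi\, \mathsf{snr}}\left(\sqrt{\mathsf{snr}}\, X_0 + \frac{N}{\sqrt{\eta}}\right)\right]^2\right\} \tag{48}$$

$$= \frac{\eta + \xi^2\, \mathsf{snr}}{\eta\, (1 + \xi\, \mathsf{snr})^2}. \tag{49}$$

Meanwhile, the variance of $X$ conditioned on $Z$ is independent
of $Z$. Hence the variance (20) of the retrochannel output is
independent of $\eta$:

$$\mathcal{V}(\mathsf{snr}; \eta, \xi) = \frac{1}{1 + \xi\, \mathsf{snr}}. \tag{50}$$

From Claim 1, one finds that $\xi$ is the solution to

$$\xi^{-1} = \sigma^2 + \beta\, \mathsf{E}\left\{\frac{\mathsf{snr}}{1 + \xi\, \mathsf{snr}}\right\} \tag{51}$$

and the multiuser efficiency is determined as

$$\eta = \xi + \xi\, (\sigma^2 - 1)\left[1 + \beta\, \mathsf{E}\left\{\frac{\mathsf{snr}}{(1 + \xi\, \mathsf{snr})^2}\right\}\right]^{-1}. \tag{52}$$

Clearly, the large-system multiuser efficiency of such a linear
detector is independent of the input distribution.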

Suppose also that the postulated noise level $\sigma \to \infty$. The
PME becomes the matched filter. One finds $\xi \sigma^2 \to 1$ by (51)
and consequently, the multiuser efficiency of the matched filter
is [1]

$$\eta^{(\mathrm{mf})} = \frac{1}{1 + \beta\, \mathsf{E}\{\mathsf{snr}\}}. \tag{53}$$

In case $\sigma = 1$, one has the linear MMSE detector. By (52),
$\eta = \xi$ and by (51), the multiuser efficiency satisfies

$$\eta^{-1} = 1 + \beta\, \mathsf{E}\left\{\frac{\mathsf{snr}}{1 + \eta\, \mathsf{snr}}\right\}, \tag{54}$$

which is the Tse-Hanly equation [6], [10]. The fixed-point
equation (54) has a unique positive solution.

⁵ This principle, originally discovered by Varanasi and Guess [42], has been
shown with other proofs and in other settings [10], [43], [44], [45], [46], [47].

By letting $\sigma \to 0$ one obtains the decorrelator. If $\beta < 1$,
then (51) gives $\xi \to \infty$ and $\xi \sigma^2 \to 1 - \beta$, and the
multiuser efficiency is found as $\eta = 1 - \beta$ by (52) regardless
of the SNR distribution (as shown in [1]). If $\beta > 1$, and assuming
the generalized form of the decorrelator as the Moore-Penrose
inverse of the correlation matrix [1], then $\xi$ is the unique
solution to

$$\xi^{-1} = \beta\, \mathsf{E}\left\{\frac{\mathsf{snr}}{1 + \xi\, \mathsf{snr}}\right\} \tag{55}$$

and the multiuser efficiency is found by (52) with $\sigma = 0$. In
the special case of identical SNRs, an explicit expression is
found [7], [8]:

$$\eta^{(\mathrm{dec})} = \frac{\beta - 1}{\beta + \mathsf{snr}\, (\beta - 1)^2}, \quad \beta > 1. \tag{56}$$
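The linear-detector results above lend themselves to a quick numerical cross-check. The sketch below is an illustration only: it solves (51) for $\xi$ by bisection and then evaluates (52), with the SNR distribution represented by a list of equally likely values. The function name, the bisection tolerance, and the bracket-growing step are assumptions of the example, not something the paper prescribes. Setting `sigma = 1` should reproduce the Tse-Hanly solution of (54), a very large `sigma` should approach the matched-filter value (53), and `sigma = 0` with `beta > 1` should reproduce (56) for identical SNRs.

```python
import numpy as np

def linear_pme_efficiency(beta, sigma, snrs, tol=1e-13):
    """Solve (51) for xi by bisection, then return (eta, xi) with eta from (52).

    `snrs` holds equally likely SNR values, so expectations are plain means.
    """
    snrs = np.asarray(snrs, dtype=float)
    if sigma == 0.0 and beta <= 1.0:
        # In this regime xi diverges (decorrelator with beta < 1, eta = 1 - beta).
        raise ValueError("sigma = 0 requires beta > 1 in this sketch")

    def residual(xi):
        # Increasing in xi and equal to -1 at xi = 0; its root solves (51).
        return xi * sigma ** 2 + beta * np.mean(xi * snrs / (1.0 + xi * snrs)) - 1.0

    lo, hi = 0.0, 1.0
    while residual(hi) < 0.0:          # grow the bracket until it encloses the root
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    xi = 0.5 * (lo + hi)
    a = np.mean(snrs / (1.0 + xi * snrs) ** 2)
    eta = xi + xi * (sigma ** 2 - 1.0) / (1.0 + beta * a)
    return eta, xi

if __name__ == "__main__":
    print(linear_pme_efficiency(0.8, 1.0, [5.0]))      # linear MMSE
    print(linear_pme_efficiency(0.5, 1e4, [1.0, 3.0])) # close to matched filter
    print(linear_pme_efficiency(2.0, 0.0, [3.0]))      # decorrelator, beta > 1
```

Bisection is used here because the residual of (51) is monotone in $\xi$, so the root is unique and a bracketing method cannot wander off to a spurious solution.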


By Corollary 2, the mutual information with input distribution
$p_X$ for a user with $\mathsf{snr}$ under linear multiuser detection is
equal to the input-output mutual information of the single-user
Gaussian channel (16) with the same input:

$$I(X_0; \langle X \rangle_q \mid \mathsf{snr}) = I(\eta\, \mathsf{snr}), \tag{57}$$

where $\eta$ depends on which type of linear detector is in use.
Gaussian priors are known to achieve the capacity:

$$C(\mathsf{snr}) = \frac{1}{2} \log(1 + \eta\, \mathsf{snr}). \tag{58}$$

By (38), the total spectral efficiency under Gaussian
inputs is expressed in terms of the linear MMSE multiuser
efficiency $\eta$ given by (54):

$$C_{\mathrm{joint}}(\beta) = \frac{\beta}{2}\, \mathsf{E}\{\log(1 + \eta\, \mathsf{snr})\} + \frac{1}{2}\left[(\eta - 1)\log e - \log \eta\right]. \tag{59}$$

This is Shamai and Verdú's result for fading channels [11].

2) Optimal Detectors: Using the actual input distribution
$p_X$ as the postulated prior of the PME results in optimum
multiuser detectors. In case of the jointly optimal detector, the
postulated noise level $\sigma = 0$, and (21) becomes

$$\eta^{-1} = 1 + \beta\, \mathsf{E}\{\mathsf{snr} \cdot \mathcal{E}(\mathsf{snr}; \eta, \xi)\}, \tag{60a}$$

$$\xi^{-1} = \beta\, \mathsf{E}\{\mathsf{snr} \cdot \mathcal{V}(\mathsf{snr}; \eta, \xi)\}, \tag{60b}$$

where $\mathcal{E}(\cdot)$ and $\mathcal{V}(\cdot)$ are given by (28) and (29)
respectively with $q_i(z, \mathsf{snr}; x) = p_i(z, \mathsf{snr}; x)$,
$\forall x$. The parameters can then be solved numerically.

In case of the individually optimal detector, one sets $\sigma = 1$
so that $q = p$. The optimal multiuser efficiency $\eta$ is the solution
to the fixed-point equation (34) given in Claim 2.

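It is of practical interest to find the spectral efficiency under
the constraint that the input symbols are antipodally modulated
as in the popular BPSK. In this case, the probability mass
function $p_X(x) = 1/2$, $x = \pm 1$, maximizes the mutual
information. It can be shown that

$$\mathsf{mmse}(\gamma) = 1 - \int \frac{e^{-z^2/2}}{\sqrt{2\pi}} \tanh\left(\gamma - z\sqrt{\gamma}\right) dz. \tag{61}$$

By Claim 2, the multiuser efficiency, $\eta^{(\mathrm{b})}$, where the
superscript (b) stands for binary inputs, is a solution to the
fixed-point equation [8]:

$$\frac{1}{\eta} = 1 + \beta\, \mathsf{E}\left\{\mathsf{snr}\left[1 - \int \frac{e^{-z^2/2}}{\sqrt{2\pi}} \tanh\left(\eta\, \mathsf{snr} - z\sqrt{\eta\, \mathsf{snr}}\right) dz\right]\right\}, \tag{62}$$

which is a generalization of an earlier result assuming equal-power
users due to Tanaka [23]. The single-user channel
capacity for a user with signal-to-noise ratio $\mathsf{snr}$ is the same
as that obtained by Müller and Gerstacker [24] and is given
by

$$C^{(\mathrm{b})}(\mathsf{snr}) = -\int \frac{e^{-z^2/2}}{\sqrt{2\pi}} \log\cosh\left(\eta^{(\mathrm{b})} \mathsf{snr} - z\sqrt{\eta^{(\mathrm{b})} \mathsf{snr}}\right) dz + \eta^{(\mathrm{b})}\, \mathsf{snr}\, \log e. \tag{63}$$

The total spectral efficiency of the CDMA channel subject to
binary inputs is thus

$$C^{(\mathrm{b})}_{\mathrm{joint}} = -\beta\, \mathsf{E}\left\{\int \frac{e^{-z^2/2}}{\sqrt{2\pi}} \log\cosh\left(\eta^{(\mathrm{b})} \mathsf{snr} - z\sqrt{\eta^{(\mathrm{b})} \mathsf{snr}}\right) dz\right\} + \beta\, \eta^{(\mathrm{b})}\, \mathsf{E}\{\mathsf{snr}\} \log e + \frac{1}{2}\left[(\eta^{(\mathrm{b})} - 1)\log e - \log \eta^{(\mathrm{b})}\right], \tag{64}$$

which is also a generalization of Tanaka's implicit result [23].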
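For concreteness, the binary-input formulas (61)-(64) can be evaluated with a few lines of Python. This is a sketch under stated assumptions rather than the paper's procedure: users have equal power, logarithms are natural, the Gaussian integrals are approximated by a Riemann sum on a fixed grid, and (62) is solved by plain iteration started at $\eta = 1$. When (62) has several solutions, that iteration settles on the largest one, which is not necessarily the one that minimizes the free energy. All function names are hypothetical.

```python
import numpy as np

# Grid and weights approximating integration against the standard Gaussian density.
_Z = np.linspace(-12.0, 12.0, 24001)
_W = np.exp(-0.5 * _Z ** 2) / np.sqrt(2.0 * np.pi) * (_Z[1] - _Z[0])

def mmse_bpsk(gamma):
    """Equation (61): MMSE of an equiprobable +/-1 input at SNR gamma."""
    return 1.0 - np.sum(_W * np.tanh(gamma - _Z * np.sqrt(gamma)))

def info_bpsk_nats(gamma):
    """Equation (63) in nats with eta*snr = gamma."""
    u = np.abs(gamma - _Z * np.sqrt(gamma))
    log_cosh = u + np.log1p(np.exp(-2.0 * u)) - np.log(2.0)  # overflow-safe log cosh
    return gamma - np.sum(_W * log_cosh)

def eta_bpsk(beta, snr, iters=500):
    """Iterate (62) for equal-power users, starting from eta = 1."""
    eta = 1.0
    for _ in range(iters):
        eta = 1.0 / (1.0 + beta * snr * mmse_bpsk(eta * snr))
    return eta

def c_joint_bpsk_nats(beta, snr):
    """Equation (64) in nats for equal-power users."""
    eta = eta_bpsk(beta, snr)
    return beta * info_bpsk_nats(eta * snr) + 0.5 * ((eta - 1.0) - np.log(eta))

if __name__ == "__main__":
    print(eta_bpsk(0.5, 4.0), c_joint_bpsk_nats(0.5, 4.0))
```

A useful consistency check on such code is the I-MMSE relation (43): a finite-difference derivative of `info_bpsk_nats` should match half of `mmse_bpsk` at the same SNR.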


III. Communications and Statistical Physics 

This section briefly introduces the concepts and methodologies
that will be needed to prove the results summarized in
Section II. Although one can work with the mathematical
framework only and avoid foreign concepts, we believe it
is more enlightening to draw an equivalence between multiuser
communications and many-body problems in statistical
physics. Such an analogy is seen in an embryonic form in [23]
and will be developed in full generality here.

A. A Note on Statistical Physics 

Consider the physics of a many-body system, the microscopic
state of which is described by the configuration of
some $K$ variables as a vector $x$. The state of the system
evolves over time according to some physical laws. Let the
energy associated with the state, called the Hamiltonian, be
denoted by the function $H(x)$. Let $p(x)$ denote the probability
that the system is found in configuration $x$. Then, at thermal
equilibrium, the energy of the system

$$\mathcal{E} = \sum_x p(x) H(x) \tag{65}$$

is preserved, while the Second Law of Thermodynamics
dictates that the entropy (disorder) of the system

$$\mathcal{S} = -\sum_x p(x) \log p(x) \tag{66}$$

is maximized. Although we are unable to follow the exact
trajectory of the configuration, e.g., we do not know the exact
configuration $x$ at a given time, the probability distribution
of the configuration can be determined using the Lagrange
multiplier method. Indeed, using (65) and (66), the equilibrium
probability distribution $p(x)$ is found to be negative exponential
in the Hamiltonian, which is known as the Boltzmann
distribution:

$$p(x) = Z^{-1} \exp\left[-\frac{1}{T} H(x)\right] \tag{67}$$

where

$$Z = \sum_x \exp\left[-\frac{1}{T} H(x)\right] \tag{68}$$

is the partition function, and the temperature $T > 0$ is
determined by the energy constraint (65). The most probable
configuration is the ground state which has the minimum
Hamiltonian. Generally speaking, statistical physics is a theory
that studies macroscopic properties (e.g., pressure, magnetization)
of such a system starting from the Hamiltonian by
taking the above probabilistic viewpoint. One particularly
useful macroscopic quantity of the thermodynamic system is
the free energy:

$$\mathcal{F} = \mathcal{E} - T \mathcal{S}. \tag{69}$$

Using (65)-(68), one finds that the free energy at equilibrium
can also be expressed as

$$\mathcal{F} = -T \log Z. \tag{70}$$

Indeed, at thermal equilibrium, the temperature and energy 
of the system remain constant, the entropy is the maximum 
possible, and the free energy is at its minimum. The free 
energy is often the starting point for calculating macroscopic 
properties of a thermodynamic system. 
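These relations can be verified on a toy system. The sketch below, which uses arbitrary illustrative energy levels and natural logarithms, builds the Boltzmann distribution (67), checks that $\mathcal{E} - T\mathcal{S}$ equals $-T \log Z$ as in (70), and allows comparison against other distributions on the same states, which should never have a lower free energy. The function names and the numbers are hypothetical and chosen only for the example.

```python
import numpy as np

def boltzmann(energies, temperature):
    """Return (p, Z): the Boltzmann distribution (67) and partition function (68)."""
    weights = np.exp(-np.asarray(energies, dtype=float) / temperature)
    z = weights.sum()
    return weights / z, z

def free_energy(p, energies, temperature):
    """E - T*S from (65), (66) and (69) for an arbitrary distribution p (nats)."""
    p = np.asarray(p, dtype=float)
    energy = np.sum(p * energies)
    support = p > 0                       # 0 * log 0 is taken as 0
    entropy = -np.sum(p[support] * np.log(p[support]))
    return energy - temperature * entropy

if __name__ == "__main__":
    levels = np.array([0.0, 0.5, 1.3, 2.0])   # toy energy levels
    T = 0.7
    p_eq, Z = boltzmann(levels, T)
    print(free_energy(p_eq, levels, T), -T * np.log(Z))
    print(free_energy(np.full(4, 0.25), levels, T))  # uniform: higher free energy
```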


B. Multiuser Communications and Spin Glasses 

The communication problem faced by the detector is to
infer statistically the information-bearing symbols given the
received signal and knowledge about the channel state. Naturally,
the posterior probability distribution plays a central
role. In the multiple-access channel (3), the channel state
consists of the spreading sequences and the SNRs, collectively
represented by the matrix $S$. The channel is described by the
Gaussian density $p_{Y|X,S}$ given by (3). By postulating an input
$q_X$ and a channel (10) which differs from the actual one only
in the noise level, the postulated posterior distribution can be
obtained by using the Bayes formula (cf. (13)) as

$$q_{X|Y,S}(x|y,S) = \frac{q_X(x)}{q_{Y|S}(y|S)}\, (2\pi\sigma^2)^{-L/2} \exp\left[-\frac{\|y - Sx\|^2}{2\sigma^2}\right] \tag{71}$$

where

$$q_{Y|S}(y|S) = (2\pi\sigma^2)^{-L/2}\, \mathsf{E}_q\left\{\exp\left[-\frac{\|y - SX\|^2}{2\sigma^2}\right] \,\middle|\, S\right\} \tag{72}$$

and the expectation in (72) is taken conditioned on $S$ over $X$
with distribution $q_X$.

In order to take advantage of the statistical physics methodologies,
we create an artificial thermodynamic system, called a
spin glass, that is equivalent to the communication problem.
In certain special cases, this connection is found in [23], while
we now draw this analogy in the general setting. A spin glass
is a system consisting of many directional spins, in which
the interaction of the spins is determined by the so-called
quenched random variables whose values are determined by
the realization of the spin glass. An example is a system
consisting of molecules with magnetic spins that evolve over
time, while the positions of the molecules that determine the
amount of interactions are random (disordered) but remain
fixed for each concrete instance as in a piece of glass. Let
the microscopic state of a spin glass be denoted by a
$K$-dimensional vector $x$, and the quenched random variables
by $(y, S)$. The system can be understood as $K$ random
spins sitting in quenched randomness $(y, S)$, and its statistical
physics described as in Section III-A with a parameterized
Hamiltonian $H_{y,S}(x)$.

Indeed, suppose the temperature $T = 1$ and that the
Hamiltonian of a piece of spin glass is defined as

$$H_{y,S}(x) = \frac{\|y - Sx\|^2}{2\sigma^2} - \log q_X(x) + \frac{L}{2} \log(2\pi\sigma^2), \tag{73}$$

then the configuration distribution of the spin glass at equilibrium
is given by (71) and its corresponding partition function
by (72) (cf. (67) and (68)). Precisely, the probability that the
transmitted symbol is $X = x$ under the postulated model,
given the observation $Y$ and the channel state $S$, is equal to
the probability that the spin glass is found at configuration
$x$, given the quenched random variables $(Y, S)$. Note that a
Gaussian distribution is a natural Boltzmann distribution with
the squared Euclidean norm as the Hamiltonian.
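The equivalence between the postulated posterior and the Boltzmann distribution of the spin glass can be made concrete on a system small enough to enumerate. The following sketch is illustrative only: the sizes, the random binary signatures with unit SNR, the equiprobable $\pm 1$ postulated prior, and the variable names are assumptions of the example. It computes the posterior once through the Hamiltonian (73) with $T = 1$ and once directly from (71)-(72).

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
K, L, sigma = 4, 6, 0.8                       # toy sizes, for illustration only
S = rng.choice([-1.0, 1.0], size=(L, K)) / np.sqrt(L)   # random signatures, unit SNRs
x0 = rng.choice([-1.0, 1.0], size=K)          # transmitted symbols
y = S @ x0 + rng.standard_normal(L)           # actual channel: unit noise variance

# All 2^K spin configurations and the postulated (equiprobable) log-prior.
configs = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))
log_prior = np.full(len(configs), -K * np.log(2.0))

def hamiltonian(x, log_q):
    """Equation (73): H_{y,S}(x) for the postulated noise level sigma."""
    return (np.sum((y - S @ x) ** 2) / (2.0 * sigma ** 2) - log_q
            + 0.5 * L * np.log(2.0 * np.pi * sigma ** 2))

H = np.array([hamiltonian(x, lq) for x, lq in zip(configs, log_prior)])
Z = np.sum(np.exp(-H))                        # partition function (68) at T = 1
boltzmann = np.exp(-H) / Z                    # Boltzmann distribution (67)

# Direct Bayes computation following (71) and (72).
sq_dist = np.sum((y[None, :] - configs @ S.T) ** 2, axis=1)
lik = (2.0 * np.pi * sigma ** 2) ** (-L / 2.0) * np.exp(-sq_dist / (2.0 * sigma ** 2))
evidence = np.sum(np.exp(log_prior) * lik)    # q_{Y|S}(y|S)
posterior = np.exp(log_prior) * lik / evidence
```

In this toy run the partition function `Z` coincides with the postulated evidence `evidence`, and `boltzmann` coincides with `posterior`, which is exactly the correspondence stated above.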

The richness of the system is encoded in the quenched
randomness $(Y, S)$. In the communication channel described
by (3), $(Y, S)$ takes a specific distribution, i.e., it is a
realization of the received signal and channel state matrix according
to the prior and conditional distributions that underlie the
"original" spins. Indeed, the communication system depicted
in Figure 5(a) can also be understood as a spin glass $X$ subject
to physical law $q$ sitting in the quenched randomness caused by
another spin glass $X_0$ subject to physical law $p$. The channel
corresponds to the random mapping from a given spin glass
configuration to an induced quenched randomness. Conversely,
the retrochannel corresponds to the random mechanism that
maps some quenched randomness into an induced spin glass
configuration distribution.

The free energy of the thermodynamic (or communication)
system normalized by the number of users is ($T = 1$)

$$-\frac{1}{K} \log Z(Y, S) = -\frac{1}{K} \log q_{Y|S}(Y|S). \tag{74}$$

Due to the self-averaging assumption, the randomness of
(74) vanishes as $K \to \infty$. As a result, the free energy per
user converges in probability to its expected value over the
distribution of the quenched random variables $(Y, S)$ in the
large-system limit, which is denoted by $\mathcal{F}$,

$$\mathcal{F} = -\lim_{K \to \infty} \mathsf{E}\left\{\frac{1}{K} \log q_{Y|S}(Y|S)\right\}. \tag{75}$$

Hereafter, by the free energy we refer to the large-system limit
(75), which will be calculated in Section IV.


The reader should be cautioned that for disordered systems, 
thermodynamic quantities may or may not be self-averaging 
[48]. The self-averaging property remains to be proved or 
disproved in the CDMA context. This is a challenging problem 
on its own. Buttressed by numerical examples and associated
results using random matrix theory, in this work the
self-averaging property is assumed to hold.

The self-averaging property resembles the asymptotic 
equipartition property (AEP) in information theory [49]. An 
important consequence is that a macroscopic quantity of a 
thermodynamic system, which is a function of a large number 
of random variables, may become increasingly predictable 
from merely a few parameters independent of the realization of 
the random variables as the system size grows without bound. 
Indeed, such a macroscopic quantity converges in probability 
to its ensemble average in the thermodynamic limit. 
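Self-averaging can be observed empirically in a case where the free energy per user (74) is available in closed form. The sketch below assumes, purely for illustration, that both the actual and the postulated input are standard Gaussian with unit SNRs and i.i.d. Gaussian signatures, so that the postulated law of $Y$ given $S$ is $\mathcal{N}(0, SS^{\mathsf{T}} + \sigma^2 I)$. It then measures how the spread of $-\frac{1}{K}\log q_{Y|S}(Y|S)$ across random realizations shrinks as $K$ grows. The function names, the load, and the number of trials are hypothetical choices, and the experiment is a demonstration, not a proof of the property.

```python
import numpy as np

def free_energy_per_user(K, beta, sigma, rng):
    """-(1/K) log q_{Y|S}(Y|S) in nats for one realization (Gaussian q_X and p_X)."""
    L = int(round(K / beta))
    S = rng.standard_normal((L, K)) / np.sqrt(L)     # i.i.d. signatures, unit SNRs
    y = S @ rng.standard_normal(K) + rng.standard_normal(L)
    cov = S @ S.T + sigma ** 2 * np.eye(L)           # postulated law: Y | S ~ N(0, cov)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return (0.5 * L * np.log(2.0 * np.pi) + 0.5 * logdet + 0.5 * quad) / K

def spread(K, trials=200, beta=0.5, sigma=1.0, seed=0):
    """Sample mean and standard deviation of the per-user free energy."""
    rng = np.random.default_rng(seed)
    vals = [free_energy_per_user(K, beta, sigma, rng) for _ in range(trials)]
    return float(np.mean(vals)), float(np.std(vals))

if __name__ == "__main__":
    for K in (8, 32, 128):
        print(K, spread(K))
```

In this Gaussian setting the standard deviation is expected to fall roughly like $1/\sqrt{K}$ while the mean stabilizes, which is the behavior the self-averaging assumption postulates for general inputs.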

In the CDMA context, the self-averaging property leads
to the strong consequence that for almost all realizations of
the received signal and the spreading sequences, macroscopic
quantities such as the BER, the output SNR and the spectral
efficiency, averaged over data, converge to deterministic
quantities in the large-system limit. Previous work (e.g., [6],
[9], [10]) has shown convergence of performance measures
for almost all spreading sequences. The self-averaging property
results in convergence of certain empirical performance
measures, which holds for almost all realizations of the data
as well as noise.


C. Spectral Efficiency and Detection Performance 

Consider the multiuser channel, the multiuser PME and the
companion retrochannel as depicted in Figure 5(a). Equipped
with the statistical physics concepts introduced in Sections III-A
and III-B, this subsection associates the spectral efficiency and
detection performance of such a system with more tangible
quantities for calculation.

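1) Spectral Efficiency and Free Energy: For a fixed input
distribution $p_X$, the total input-output mutual information of
the multiuser channel is

\begin{align}
I(\mathbf{X};\mathbf{Y}|\mathbf{S}) &= \mathsf{E}\left\{\left.\log\frac{p_{\mathbf{Y}|\mathbf{X},\mathbf{S}}(\mathbf{Y}|\mathbf{X},\mathbf{S})}{p_{\mathbf{Y}|\mathbf{S}}(\mathbf{Y}|\mathbf{S})}\,\right|\,\mathbf{S}\right\} \tag{76}\\
&= -\mathsf{E}\left\{\log p_{\mathbf{Y}|\mathbf{S}}(\mathbf{Y}|\mathbf{S})\,\middle|\,\mathbf{S}\right\} - \frac{L}{2}\log(2\pi e), \tag{77}
\end{align}

where the simplification to (77) holds because $p_{\mathbf{Y}|\mathbf{X},\mathbf{S}}$ given by
(3) is an $L$-dimensional Gaussian density. Calculating (77)
is formidable for an arbitrary realization of $\mathbf{S}$. However,
due to the self-averaging property, it suffices to evaluate its
expectation over the spreading sequences. In view of (75), the
large-system spectral efficiency is affine in the free energy with
a postulated measure $q$ identical to the actual measure $p$: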


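\begin{align}
C &= \frac{1}{L}\, I(\mathbf{X};\mathbf{Y}|\mathbf{S}) \tag{78}\\
&= -\beta\,\mathsf{E}\left\{\frac{1}{K}\log p_{\mathbf{Y}|\mathbf{S}}(\mathbf{Y}|\mathbf{S})\,\middle|\,\mathbf{S}\right\} - \frac{1}{2}\log(2\pi e) \tag{79}\\
&\to \beta\,\mathcal{F}\big|_{q=p} - \frac{1}{2}\log(2\pi e). \tag{80}
\end{align}

Relationship (80) is a full generalization of a previous
observation [23, (82)] in some special cases. In fact, the analogy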
between free energy and information-theoretic quantities has 
also been noticed in belief propagation [50], [51], coding [52] 
and optimization problems [53]. 

2) Detection Performance and Moments: In case of a multiuser
detector front end, one is interested in the quality of the
detection output for each user, which is completely described
by the distribution of the detection output conditioned on the
input. Let us focus on an arbitrary user $k$, and let $X_{0k}$, $\langle X_k\rangle_q$
and $X_k$ be the input, the PME output, and the retrochannel
output, respectively (cf. Figure 5(a)). Instead of the conditional
distribution $P_{\langle X_k\rangle_q | X_{0k}}$, we solve a more ambitious problem:
the joint distribution of $(X_{0k}, \langle X_k\rangle_q, X_k)$ conditioned on the
channel state $\mathbf{S}$ in the large-system limit.

Our approach is to calculate the joint moments

\begin{equation}
\mathsf{E}\left\{X_{0k}^{i}\, X_k^{j}\, \langle X_k\rangle_q^{l}\,\middle|\,\mathbf{S}\right\}, \quad i, j, l = 0, 1, \ldots \tag{81}
\end{equation}

By the self-averaging property, each moment, as a function of
the channel state $\mathbf{S}$, converges to the same value for almost
all realizations of $\mathbf{S}$. Thus it suffices to calculate

\begin{equation}
\mathsf{E}\left\{X_{0k}^{i}\, X_k^{j}\, \langle X_k\rangle_q^{l}\right\} \tag{82}
\end{equation}

as $K \to \infty$, which is viable by studying the free energy
associated with a modified version of the partition function 
(| 23 . More on this later. 

The joint distribution becomes clear once all the moments
(82) are determined, and so does the relationship between the
detection output $\langle X_k\rangle_q$ and the input $X_{0k}$. It turns out the
large-system joint distribution of $(X_{0k}, \langle X_k\rangle_q, X_k)$ is exactly
the same as that of the input, PME output and retrochannel
output associated with a single-user Gaussian channel with the
same input distribution but with a degradation in the SNR.
In other words, the subchannel seen by an individual user is
essentially equivalent to a single-user Gaussian channel in the
large-system limit. The mutual information between the input
and the detection output for user $k$ is expressed as

\begin{equation}
I(X_{0k}; \langle X_k\rangle_q \,|\, \mathbf{S}), \tag{83}
\end{equation}

which can be obtained once the input-output relationship is
known. It will be shown that conditioning on the channel state
$\mathbf{S}$ becomes superfluous as $K \to \infty$.

We have distilled our problems under both joint and separate 
decoding to finding some ensemble averages, namely, the free 
energy (75) and the joint moments (82). In order to calculate
these quantities, we resort to a powerful technique developed 
in the theory of spin glass, the heart of which is sketched in 
the following subsection. 


D. Replica Method 

Direct calculation of the free energy in (80) is hard. In 1975,
S. F. Edwards and P. W. Anderson [26] invented the replica
method to study the free energy of magnetic and disordered
systems, which has since become a standard technique in
statistical physics [39]. The replica method was introduced
to the field of multiuser detection by Tanaka [23] to analyze
the optimal detectors under equal-power Gaussian or binary
input (see also [54]). Concurrently with our work [8], [55], [56],
[57], the replica method has also been used to analyze large
dual antenna systems [58] and belief propagation decoding of
CDMA [59], [35], [60], [61].

Fig. 6. The replicas of the retrochannel.

Essentially, the replica method takes the following steps:

1) Reformulate (75) as

\begin{equation}
\mathcal{F} = -\lim_{K\to\infty}\frac{1}{K}\lim_{u\to 0}\frac{\partial}{\partial u}\log \mathsf{E}\{Z^u(\mathbf{Y},\mathbf{S})\} \tag{84}
\end{equation}

where $Z(\mathbf{Y},\mathbf{S}) = q_{\mathbf{Y}|\mathbf{S}}(\mathbf{Y}|\mathbf{S})$. The equivalence of (75)
and (84) can be verified by noticing that for all $\Theta > 0$,

\begin{equation}
\lim_{u\to 0}\frac{\partial}{\partial u}\log\mathsf{E}\{\Theta^u\} = \lim_{u\to 0}\frac{\mathsf{E}\{\Theta^u\log\Theta\}}{\mathsf{E}\{\Theta^u\}} = \mathsf{E}\{\log\Theta\}. \tag{85}
\end{equation}

2) For an arbitrary positive integer $u$, calculate

\begin{equation}
-\lim_{K\to\infty}\frac{1}{K}\log\mathsf{E}\{Z^u(\mathbf{Y},\mathbf{S})\} \tag{86}
\end{equation}

by introducing $u$ replicas of the system (hence the name
"replica" method).

3) Assuming the resulting expression from Step 2 to be
valid for all real-valued $u$ in the vicinity of $u = 0$, take
its derivative at $u = 0$ to obtain the free energy (84). It is
also assumed that the limits in (84) can be interchanged.

Note that the validity of the replica method hinges on the
two assumptions made in Step 3. We now elaborate on how
to perform Step 2, i.e., how to calculate (86) for an integer $u$,
henceforth referred to as the replica number.
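The identity (85) behind Step 1 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes a log-normal random variable for $\Theta$ (an arbitrary choice), replaces the expectation by an empirical average, and approximates the derivative in $u$ by a central finite difference. The function name `log_moment_derivative` is hypothetical.

```python
import numpy as np

def log_moment_derivative(theta, u, du=1e-4):
    """Central finite difference of u -> log E{theta**u}, with E taken as a sample mean."""
    f = lambda v: np.log(np.mean(theta ** v))
    return (f(u + du) - f(u - du)) / (2 * du)

rng = np.random.default_rng(0)
# Positive samples; the log-normal law is an illustrative assumption only.
theta = np.exp(rng.normal(size=200_000))

lhs = log_moment_derivative(theta, u=0.0)  # d/du log E{Theta^u} at u = 0
rhs = np.mean(np.log(theta))               # E{log Theta}
print(lhs, rhs)
```

Because both sides use the same samples, they agree up to the finite-difference error rather than up to Monte Carlo noise. The check says nothing about the harder assumptions of Step 3, namely the continuation from integer to real $u$ and the exchange of limits.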

For an arbitrary positive integer $u$, we introduce $u$ independent
replicas of the retrochannel (or the spin glass) with the
same received signal $\mathbf{Y}$ and channel state $\mathbf{S}$ as depicted in
Figure 6. The partition function of the replicated system is

\begin{equation}
Z^u(\mathbf{y},\mathbf{S}) = \mathsf{E}\left\{\prod_{a=1}^{u} q_{\mathbf{Y}|\mathbf{X},\mathbf{S}}(\mathbf{y}|\mathbf{X}_a,\mathbf{S})\right\} \tag{87}
\end{equation}

where the expectation is taken over the replicas $\{X_{ak} \,|\, a =
1,\ldots,u,\ k = 1,\ldots,K\}$. Here, $X_{ak}$ are i.i.d. (with distribution
$q_X$) since $(\mathbf{Y},\mathbf{S})$ are given. With the new expression (87)
using the replicas, we proceed as follows. Since $q_{\mathbf{Y}|\mathbf{X},\mathbf{S}}(\cdot|\mathbf{X}_a,\mathbf{S})$ is a
conditional Gaussian density, the product in (87) is a scaled
version of another Gaussian density conditioned on $\mathbf{S}$ and all
$\mathbf{X}_a$. By taking the integral with respect to $\mathbf{y}$ first and then
averaging over the spreading sequences, one finds that



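\begin{equation}
\frac{1}{K}\log\mathsf{E}\{Z^u(\mathbf{Y},\mathbf{S})\} = \frac{1}{K}\log\mathsf{E}\left\{\exp\left[\beta^{-1}K\,G_K^{(u)}(\mathbf{A},\underline{\mathbf{X}})\right]\right\} \tag{88}
\end{equation}

where $G_K^{(u)}$ is some function of the SNRs and the transmitted
symbols and their replicas, collectively denoted by a $K \times (u +
1)$ matrix $\underline{\mathbf{X}} = [\mathbf{X}_0, \ldots, \mathbf{X}_u]$.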

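The replica method then exploits the symmetry in $\underline{\mathbf{X}}$ in
order to evaluate (88). Instead of calculating the expectation
(88) with respect to $\underline{\mathbf{X}}$ all at once, we do it by first conditioning
on the correlation matrix $\mathbf{Q} = (1/K)\,\underline{\mathbf{X}}^{\top}\mathbf{A}^{2}\underline{\mathbf{X}}$. It turns
out that conditioned on the replica correlation matrix $\mathbf{Q}$, the
expectation with respect to $\underline{\mathbf{X}}$ is equivalent to an integral over
a multivariate Gaussian distribution due to the central limit
theorem, which helps to reduce (88) to

\begin{equation}
\frac{1}{K}\log\int\exp\left[\beta^{-1}K\,G^{(u)}(\mathbf{Q})\right]\mu_K^{(u)}(\mathrm{d}\mathbf{Q}) + O\!\left(\frac{1}{K}\right) \tag{89}
\end{equation}

where $G^{(u)}$ is some function (independent of $K$) of the $(u +
1) \times (u + 1)$ random correlation matrix $\mathbf{Q}$, and $\mu_K^{(u)}$ is the
probability measure of $\mathbf{Q}$.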

Since for each pair $(a,b)$, $Q_{ab} = \frac{1}{K}\sum_{k=1}^{K}\mathsf{snr}_k X_{ak}X_{bk}$
is a sum of independent random variables, the probability
measure $\mu_K^{(u)}$ satisfies the large deviations property. Indeed,
by Cramér's Theorem [62], there exists a rate function $I^{(u)}$
such that the measure $\mu_K^{(u)}$ satisfies

\begin{equation}
-\lim_{K\to\infty}\frac{1}{K}\log\mu_K^{(u)}(\mathcal{A}) = \inf_{\mathbf{Q}\in\mathcal{A}} I^{(u)}(\mathbf{Q})\,\log e \tag{90}
\end{equation}

for all measurable sets $\mathcal{A}$ of $(u + 1) \times (u + 1)$ matrices. The
rate function $I^{(u)}$ is obtained through the Legendre-Fenchel
transform of the cumulant generating function of $\mu_K^{(u)}$. A key
observation is that as $K \to \infty$, the mass of the integral in (89)
concentrates on a particular subshell of $\mathbf{Q}$. Using Varadhan's
theorem [62], (89) is found to converge to

\begin{equation}
\sup_{\mathbf{Q}}\left[\beta^{-1}G^{(u)}(\mathbf{Q}) - I^{(u)}(\mathbf{Q})\right]\log e. \tag{91}
\end{equation}
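The combination of Cramér's theorem and Varadhan's theorem used to pass from (89) to (91) can be illustrated on a scalar toy problem. The sketch below is an assumption-laden stand-in, not the matrix-valued calculation of this paper: $Q$ is taken to be the sample mean of $K$ i.i.d. Bernoulli(1/2) variables and the exponent is the linear function $\theta Q$. In that case $\frac{1}{K}\log\mathsf{E}\{\exp(K\theta Q)\}$ equals the cumulant generating function for every $K$, so it can be compared directly with the supremum of exponent minus rate. Natural logarithms are used, and the names `cgf`, `rate` and `varadhan_sup` are hypothetical.

```python
import numpy as np

def cgf(t):
    """Cumulant generating function log E{exp(t*B)} for B ~ Bernoulli(1/2)."""
    return np.log(0.5 * (1.0 + np.exp(t)))

def rate(x):
    """Cramer rate function of the sample mean: the Legendre-Fenchel transform of cgf."""
    return x * np.log(2 * x) + (1 - x) * np.log(2 * (1 - x))

def varadhan_sup(theta, grid_size=200_001):
    """sup over x in (0, 1) of [theta*x - rate(x)], evaluated on a fine grid."""
    x = np.linspace(1e-9, 1 - 1e-9, grid_size)
    return np.max(theta * x - rate(x))

theta = 1.3
exact = cgf(theta)             # (1/K) log E{exp(K*theta*Q)}, the same for every K here
laplace = varadhan_sup(theta)  # large-deviations evaluation of the same quantity
print(exact, laplace)
```

The two numbers coincide because the rate function is the convex conjugate of the cumulant generating function; in the replica calculation the same mechanism operates with $\beta^{-1}G^{(u)}$ in place of the linear exponent.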


Seeking the extremum (91) over a $(u + 1)^2$-dimensional
space is hard. It turns out that in many problems the supremum
in $\mathbf{Q}$ satisfies replica symmetry, namely, that the supremum in
$\mathbf{Q}$ is identical over all replicated dimensions. Assuming replica
symmetry holds, the supremum is over merely a few order
parameters, and the free energy can be obtained analytically.
The validity of replica symmetry can be checked by calculating
the Hessian of $[\beta^{-1}G^{(u)} - I^{(u)}]$ at the replica-symmetric
supremum [27]. If the Hessian is positive definite, then the
replica-symmetric solution is stable against replica symmetry
breaking, and it is the unique solution because of the convexity
of the function $[\beta^{-1}G^{(u)} - I^{(u)}]$. Under equal-power binary
input and individually optimal detection, [23] showed that if
the system parameters satisfy a certain condition, the
replica-symmetric solution is stable against replica symmetry breaking
(see also [63]). In some other cases, replica symmetry can
be broken [35]. Unfortunately, there is no known general
condition for replica symmetry to hold. The replica-symmetric
solution, assumed for analytical tractability in this paper, is 
consistent with numerical results in the experiments shown in 
Section EH 

At any rate, the supremum (91) can be obtained as a
function of the replica number $u$. The final step is to continue
the expression to real-valued $u$ and take the derivative at
$u = 0$. The free energy is thus found and the mutual
information obtained by (80).

The replica method is also used to calculate the moments
(82). Clearly, $\mathbf{X}_0$ - $(\mathbf{Y},\mathbf{S})$ - $[\mathbf{X}_1,\ldots,\mathbf{X}_u]$ is a Markov
chain. The moments (82) are equivalent to some moments
under the replicated system:

\begin{equation}
\mathsf{E}\left\{X_{0k}^{i}\, X_{mk}^{j}\prod_{a=1}^{l} X_{ak}\right\} \tag{92}
\end{equation}

where we choose $m > l$, which can be readily evaluated by
working with a modified partition function akin to (87).

We remark that the essence of the replica method here is
its capability of converting a difficult expectation (e.g., of a
logarithm) with respect to a given large system to an expectation
of a simpler form with respect to the replicated system.
Quite different from conventional techniques is the emphasis
on large systems and symmetry from the beginning, where the
central limit theorem and large deviations help to calculate the
otherwise intractable quantities. The fact that certain statistics 
converge to a Gaussian distribution in the thermodynamic limit 
is central to the application of replica theory and to practical 
algorithms based upon the fixed-disorder equivalent of replica 
theory (i.e., the TAP approach [27]). Another technique that 
takes advantage of the asymptotic normality is the so-called 
“cavity method” in [39]. 

Following the replica recipe outlined above, a more detailed
analysis of the real-valued channel is carried out in Section IV.
The complex-valued counterpart is discussed in Section V.
As previously mentioned, while the replica trick and replica 
symmetry are assumed to be valid as well as the self-averaging 
property, their rigorous justification is still an open problem 
in mathematical physics. 

IV. Proofs Using the Replica Method 

This section proves Claims \IM using the replica method. 
The free energy dlS is first obtained and then the spectral 
efficiency under joint decoding is derived. The joint moments 
(82) are then found and it is demonstrated that the multiuser
channel can be effectively decoupled. For notational convenience,
natural logarithms are assumed throughout this section.


A. Free Energy 


We will find the free energy by dlS and then the spectral 
efficiency follows immediately from dH. From (|3, (d and 

i3, 

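\begin{align}
\mathsf{E}\{Z^u(\mathbf{Y},\mathbf{S})\}
&= \mathsf{E}\left\{\int p_{\mathbf{Y}|\mathbf{X},\mathbf{S}}(\mathbf{y}|\mathbf{X}_0,\mathbf{S})\prod_{a=1}^{u} q_{\mathbf{Y}|\mathbf{X},\mathbf{S}}(\mathbf{y}|\mathbf{X}_a,\mathbf{S})\,\mathrm{d}\mathbf{y}\right\} \tag{93}\\
&= \mathsf{E}\left\{\int (2\pi)^{-\frac{L}{2}}(2\pi\sigma^2)^{-\frac{uL}{2}}\exp\left[-\frac{1}{2}\|\mathbf{y}-\mathbf{S}\mathbf{X}_0\|^2\right]\prod_{a=1}^{u}\exp\left[-\frac{1}{2\sigma^2}\|\mathbf{y}-\mathbf{S}\mathbf{X}_a\|^2\right]\mathrm{d}\mathbf{y}\right\} \tag{94}
\end{align}

where the expectations are taken over the channel state matrix
$\mathbf{S}$, the original symbol vector $\mathbf{X}_0$ (i.i.d. entries with distribution
$p_X$), and the replicated symbols $\mathbf{X}_a$, $a = 1,\ldots,u$ (i.i.d.
entries with distribution $q_X$). Note that $\mathbf{S}$, $\mathbf{X}_0$ and $\mathbf{X}_a$ are
independent in (94). Let $\underline{\mathbf{X}} = [\mathbf{X}_0, \ldots, \mathbf{X}_u]$. From the fact
that the $L$ dimensions of the CDMA channel are independent
and statistically identical, we write (94) as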

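\begin{equation}
\mathsf{E}\{Z^u(\mathbf{Y},\mathbf{S})\}
= \mathsf{E}\left\{\left[(2\pi\sigma^2)^{-\frac{u}{2}}\int\mathsf{E}\left\{\exp\left[-\frac{1}{2}\Big(y-\tfrac{1}{\sqrt{L}}\tilde{\mathbf{S}}\mathbf{A}\mathbf{X}_0\Big)^2\right]\prod_{a=1}^{u}\exp\left[-\frac{\big(y-\tfrac{1}{\sqrt{L}}\tilde{\mathbf{S}}\mathbf{A}\mathbf{X}_a\big)^2}{2\sigma^2}\right]\,\middle|\,\mathbf{A},\underline{\mathbf{X}}\right\}\frac{\mathrm{d}y}{\sqrt{2\pi}}\right]^{L}\right\} \tag{95}
\end{equation}

where the inner expectation in (95) is taken over $\tilde{\mathbf{S}} =
[S_1, \ldots, S_K]$, a vector of i.i.d. random variables each taking
the same distribution as the random spreading chips $S_{nk}$.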
Define the following variables:

\begin{equation}
V_a = \frac{1}{\sqrt{K}}\sum_{k=1}^{K}\sqrt{\mathsf{snr}_k}\,S_k X_{ak}, \quad a = 0, 1, \ldots, u. \tag{96}
\end{equation}

Clearly, (95) can be rewritten as


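\begin{equation}
\mathsf{E}\{Z^u(\mathbf{Y},\mathbf{S})\} = \mathsf{E}\left\{\exp\left[L\,G_K^{(u)}(\mathbf{A},\underline{\mathbf{X}})\right]\right\} \tag{97}
\end{equation}

where

\begin{equation}
G_K^{(u)}(\mathbf{A},\underline{\mathbf{X}}) = -\frac{u}{2}\log(2\pi\sigma^2) + \log\int\mathsf{E}\left\{\exp\left[-\frac{1}{2}\big(y-\sqrt{\beta}\,V_0\big)^2\right]\prod_{a=1}^{u}\exp\left[-\frac{\big(y-\sqrt{\beta}\,V_a\big)^2}{2\sigma^2}\right]\,\middle|\,\mathbf{A},\underline{\mathbf{X}}\right\}\frac{\mathrm{d}y}{\sqrt{2\pi}}. \tag{98}
\end{equation}

Note that given $\mathbf{A}$ and $\underline{\mathbf{X}}$, each $V_a$ is a sum of $K$ weighted
i.i.d. random chips. Due to a vector version of the central limit
theorem, $\mathbf{V}$ converges to a zero-mean Gaussian random vector
as $K \to \infty$. For $a, b = 0, 1, \ldots, u$, define

\begin{equation}
Q_{ab} = \mathsf{E}\{V_a V_b \,|\, \mathbf{A},\underline{\mathbf{X}}\} = \frac{1}{K}\sum_{k=1}^{K}\mathsf{snr}_k X_{ak}X_{bk}. \tag{99}
\end{equation}

Although inexplicit in notation, $Q_{ab}$ is a function of
$\{\mathsf{snr}_k, X_{ak}, X_{bk}\}_{k=1}^{K}$. The random vector $\mathbf{V}$ in (98) can be
replaced by a zero-mean Gaussian vector with covariance
matrix $\mathbf{Q}$. The reader is referred to [23, Appendix B] or
[57] for a justification of the change through the Edgeworth
expansion. As a result,

\begin{equation}
\exp\left[G_K^{(u)}(\mathbf{A},\underline{\mathbf{X}})\right] = \exp\left[G^{(u)}(\mathbf{Q}) + O(K^{-1})\right] \tag{100}
\end{equation}

where the integral of the Gaussian density in (98) can be
simplified to obtain (refer to [57] for details)

\begin{equation}
G^{(u)}(\mathbf{Q}) = -\frac{1}{2}\log\det(\mathbf{I}+\boldsymbol{\Sigma}\mathbf{Q}) - \frac{1}{2}\log\left(1+\frac{u}{\sigma^2}\right) - \frac{u}{2}\log(2\pi\sigma^2) \tag{101}
\end{equation}

where $\boldsymbol{\Sigma}$ is a $(u + 1) \times (u + 1)$ matrix:

\begin{equation}
\boldsymbol{\Sigma} = \frac{\beta}{\sigma^2+u}
\begin{bmatrix}
u & -\mathbf{e}^{\top}\\
-\mathbf{e} & \left(1+\frac{u}{\sigma^2}\right)\mathbf{I} - \frac{1}{\sigma^2}\mathbf{e}\mathbf{e}^{\top}
\end{bmatrix} \tag{102}
\end{equation}

where $\mathbf{e}$ is a $u \times 1$ column vector whose entries are all 1.
It is clear that $\boldsymbol{\Sigma}$ is invariant if two nonzero indexes are
interchanged, i.e., $\boldsymbol{\Sigma}$ is symmetric in the replicas.

Footnote: The indexes of all $(u + 1) \times (u + 1)$ matrices in this paper start from 0.

By (97) and (100),

\begin{align}
\frac{1}{K}\log\mathsf{E}\{Z^u(\mathbf{Y},\mathbf{S})\}
&= \frac{1}{K}\log\mathsf{E}\left\{\exp\left[L\left(G^{(u)}(\mathbf{Q}) + O(K^{-1})\right)\right]\right\} \tag{103}\\
&= \frac{1}{K}\log\int\exp\left[\beta^{-1}K\,G^{(u)}(\mathbf{Q})\right]\mathrm{d}\mu_K^{(u)}(\mathbf{Q}) + O(K^{-1}) \tag{104}
\end{align}

where $L = \beta^{-1}K$ and the expectation over the replicated symbols is rewritten
as an integral over the probability measure of the correlation
matrix $\mathbf{Q}$, which is expressed as

\begin{equation}
\mu_K^{(u)}(\mathbf{Q}) = \mathsf{E}\left\{\prod_{0\le a\le b\le u}\delta\left(\sum_{k=1}^{K}\mathsf{snr}_k X_{ak}X_{bk} - K Q_{ab}\right)\right\} \tag{105}
\end{equation}

where $\delta(\cdot)$ is the Dirac function. Note that the limit in
$K$ and the expectation can be exchanged from (103) to
(104) by Lebesgue's dominated convergence theorem since
$\exp[G^{(u)}(\mathbf{Q})]$ is bounded by a function of $u$ independent
of $\mathbf{Q}$.

By Cramér's theorem [62, Theorem II.4.1], the probability
measure of the empirical means $Q_{ab}$ defined by (99) satisfies,
as $K \to \infty$, the large deviations property with some rate
function $I^{(u)}(\mathbf{Q})$. Let the moment generating function be
defined as

\begin{equation}
M^{(u)}(\tilde{\mathbf{Q}}) = \mathsf{E}\left\{\exp\left[\mathsf{snr}\,\mathbf{X}^{\top}\tilde{\mathbf{Q}}\mathbf{X}\right]\right\} \tag{106}
\end{equation}

where $\tilde{\mathbf{Q}}$ is a $(u + 1) \times (u + 1)$ symmetric matrix, $\mathbf{X} =
[X_0, X_1, \ldots, X_u]^{\top}$ and the expectation in (106) is taken over
independent random variables $\mathsf{snr} \sim P_{\mathsf{snr}}$, $X_0 \sim p_X$ and
$X_1, \ldots, X_u \sim q_X$. The rate of the measure $\mu_K^{(u)}$ is given by
the Legendre-Fenchel transform of the cumulant generating
function (logarithm of the moment generating function) [62]:

\begin{equation}
I^{(u)}(\mathbf{Q}) = \sup_{\tilde{\mathbf{Q}}}\left[\mathrm{tr}\{\tilde{\mathbf{Q}}\mathbf{Q}\} - \log M^{(u)}(\tilde{\mathbf{Q}})\right] \tag{107}
\end{equation}

where the supremum is taken with respect to the symmetric
matrix $\tilde{\mathbf{Q}}$.

Note the factor $K$ in the exponent in the integral in (104).
As $K \to \infty$, the integral is dominated by the maximum of
the overall effect of the exponent and the rate of the measure
on which the integral takes place. Precisely, by Varadhan's
theorem [62, Theorem II.7.1],

\begin{equation}
\lim_{K\to\infty}\frac{1}{K}\log\mathsf{E}\{Z^u(\mathbf{Y},\mathbf{S})\} = \sup_{\mathbf{Q}}\left[\beta^{-1}G^{(u)}(\mathbf{Q}) - I^{(u)}(\mathbf{Q})\right] \tag{108}
\end{equation}

where the supremum is over all (symmetric) valid correlation
matrices.

By (101), (107) and (108), one has

\begin{align}
\lim_{K\to\infty}\frac{1}{K}\log\mathsf{E}\{Z^u(\mathbf{Y},\mathbf{S})\}
&= \sup_{\mathbf{Q}}\left\{\beta^{-1}G^{(u)}(\mathbf{Q}) - \sup_{\tilde{\mathbf{Q}}}\left[\mathrm{tr}\{\tilde{\mathbf{Q}}\mathbf{Q}\} - \log M^{(u)}(\tilde{\mathbf{Q}})\right]\right\} \tag{109}\\
&= \sup_{\mathbf{Q}}\inf_{\tilde{\mathbf{Q}}} T^{(u)}(\mathbf{Q},\tilde{\mathbf{Q}}) \tag{110}
\end{align}

where

\begin{equation}
T^{(u)}(\mathbf{Q},\tilde{\mathbf{Q}}) = -\frac{1}{2\beta}\log\det(\mathbf{I}+\boldsymbol{\Sigma}\mathbf{Q}) - \mathrm{tr}\{\tilde{\mathbf{Q}}\mathbf{Q}\} + \log\mathsf{E}\left\{\exp\left[\mathsf{snr}\,\mathbf{X}^{\top}\tilde{\mathbf{Q}}\mathbf{X}\right]\right\} - \frac{1}{2\beta}\log\left(1+\frac{u}{\sigma^2}\right) - \frac{u}{2\beta}\log(2\pi\sigma^2). \tag{111}
\end{equation}

For an arbitrary $\mathbf{Q}$, we first seek the point of zero gradient with
respect to $\tilde{\mathbf{Q}}$ and find that for any given $\mathbf{Q}$, the extremum in
$\tilde{\mathbf{Q}}$ satisfies

\begin{equation}
\mathbf{Q} = \frac{\mathsf{E}\left\{\mathsf{snr}\,\mathbf{X}\mathbf{X}^{\top}\exp\left[\mathsf{snr}\,\mathbf{X}^{\top}\tilde{\mathbf{Q}}\mathbf{X}\right]\right\}}{\mathsf{E}\left\{\exp\left[\mathsf{snr}\,\mathbf{X}^{\top}\tilde{\mathbf{Q}}\mathbf{X}\right]\right\}}. \tag{112}
\end{equation}

Let $\tilde{\mathbf{Q}}^*(\mathbf{Q})$ denote the solution to (112). We then seek the
point of zero gradient of $T^{(u)}\big(\mathbf{Q}, \tilde{\mathbf{Q}}^*(\mathbf{Q})\big)$ with respect to
$\mathbf{Q}$ (see the footnote on matrix derivatives below). By virtue of the relationship (112), one finds that the
derivative of $\tilde{\mathbf{Q}}^*$ with respect to $\mathbf{Q}$ is multiplied by 0 and
hence inconsequential. Therefore, the extremum in $\mathbf{Q}$ satisfies

\begin{equation}
\tilde{\mathbf{Q}} = -\frac{1}{2\beta}(\mathbf{I}+\boldsymbol{\Sigma}\mathbf{Q})^{-1}\boldsymbol{\Sigma}. \tag{113}
\end{equation}

Footnote: The following identities are useful:
$\dfrac{\partial\log\det\mathbf{Q}}{\partial x} = \mathrm{tr}\left\{\mathbf{Q}^{-1}\dfrac{\partial\mathbf{Q}}{\partial x}\right\}$ and
$\dfrac{\partial\mathbf{Q}^{-1}}{\partial x} = -\mathbf{Q}^{-1}\dfrac{\partial\mathbf{Q}}{\partial x}\mathbf{Q}^{-1}$.

It is interesting to note from the resulting joint equations
(112)-(113) that the order in which the supremum and infimum
are taken in (110) can be exchanged. The solution
$(\mathbf{Q}^*, \tilde{\mathbf{Q}}^*)$ is in fact a saddle point of $T^{(u)}$. Notice that (112)
can also be expressed as

\begin{equation}
\mathbf{Q} = \mathsf{E}\left\{\mathsf{snr}\,\mathbf{X}\mathbf{X}^{\top}\,\middle|\,\tilde{\mathbf{Q}}\right\} \tag{114}
\end{equation}

where the expectation is over an appropriately defined conditional
Gaussian measure $p_{\mathbf{X},\mathsf{snr}|\tilde{\mathbf{Q}}}$.

Solving joint equations (112) and (113) directly is prohibitive
except in the simplest cases such as $q_X$ being Gaussian.
In the general case, because of symmetry in the matrix
$\boldsymbol{\Sigma}$ (102), we postulate that the solution to the joint equations
satisfies replica symmetry, namely, both $\mathbf{Q}^*$ and $\tilde{\mathbf{Q}}^*$ are
invariant if two (nonzero) replica indexes are interchanged.
In other words, the extremum can be written as

\begin{equation}
\mathbf{Q}^* = \begin{bmatrix}
r & m & m & \cdots & m\\
m & p & q & \cdots & q\\
m & q & p & \ddots & \vdots\\
\vdots & \vdots & \ddots & \ddots & q\\
m & q & \cdots & q & p
\end{bmatrix} \tag{115a}
\end{equation}

\begin{equation}
\tilde{\mathbf{Q}}^* = \begin{bmatrix}
c & d & d & \cdots & d\\
d & g & f & \cdots & f\\
d & f & g & \ddots & \vdots\\
\vdots & \vdots & \ddots & \ddots & f\\
d & f & \cdots & f & g
\end{bmatrix} \tag{115b}
\end{equation}

where $r, m, p, q, c, d, f, g$ are some real numbers. Under
replica symmetry, (101) is evaluated to obtain

\begin{equation}
G^{(u)}(\mathbf{Q}^*) = -\frac{u}{2}\log(2\pi\sigma^2) - \frac{u-1}{2}\log\left[1+\frac{\beta}{\sigma^2}(p-q)\right] - \frac{1}{2}\log\left[1+\frac{\beta}{\sigma^2}(p-q)+\frac{u}{\sigma^2}\big(1+\beta(r-2m+q)\big)\right]. \tag{116}
\end{equation}

The moment generating function (106) is evaluated as

\begin{align}
M^{(u)}(\tilde{\mathbf{Q}}^*)
&= \mathsf{E}\left\{\exp\left[\mathsf{snr}\left(2d\sum_{a=1}^{u}X_0X_a + 2f\sum_{1\le a<b\le u}X_aX_b + cX_0^2 + g\sum_{a=1}^{u}X_a^2\right)\right]\right\} \tag{117}\\
&= \mathsf{E}\left\{\exp\left[\left(\sqrt{f\,\mathsf{snr}}\left(\frac{d}{f}X_0+\sum_{a=1}^{u}X_a\right)\right)^{2} + \left(c-\frac{d^2}{f}\right)\mathsf{snr}\,X_0^2 + (g-f)\,\mathsf{snr}\sum_{a=1}^{u}X_a^2\right]\right\} \tag{118}
\end{align}

where $X_0 \sim p_X$ while $X_a \sim q_X$ are all independent.
The expectation (118) with respect to the symbols can be
decoupled using the unit area property of Gaussian density:

\begin{equation}
e^{x^2} = \sqrt{\frac{\eta}{2\pi}}\int\exp\left[-\frac{\eta}{2}z^2 + \sqrt{2\eta}\,xz\right]\mathrm{d}z, \quad \forall x, \eta. \tag{119}
\end{equation}

Footnote: Equation (119) is a variant of the Hubbard-Stratonovich transform [64].

Using (119) with $\eta = 2d^2/f$, (118) becomes

\begin{equation}
M^{(u)}(\tilde{\mathbf{Q}}^*) = \mathsf{E}\left\{\sqrt{\frac{d^2}{\pi f}}\int\exp\left[-\frac{d^2}{f}z^2 + 2\sqrt{\mathsf{snr}}\left(\frac{d^2}{f}X_0 + d\sum_{a=1}^{u}X_a\right)z + \left(c-\frac{d^2}{f}\right)\mathsf{snr}\,X_0^2 + (g-f)\,\mathsf{snr}\sum_{a=1}^{u}X_a^2\right]\mathrm{d}z\right\}. \tag{120}
\end{equation}

Since $X_0, \ldots, X_u$ and $\mathsf{snr}$ are independent, the rate of the measure
(107) under replica symmetry is obtained from (120) as

\begin{equation}
\begin{aligned}
I^{(u)}(\mathbf{Q}^*) = {} & rc + upg + 2umd + u(u-1)qf\\
& - \log\mathsf{E}\left\{\int\sqrt{\frac{\eta}{2\pi}}\,\mathsf{E}\left\{\exp\left[-\frac{\eta}{2}\big(z-\sqrt{\mathsf{snr}}\,X_0\big)^2 + c\,\mathsf{snr}\,X_0^2\right]\,\middle|\,\mathsf{snr}\right\}\left[\mathsf{E}_q\left\{\exp\left[2d\sqrt{\mathsf{snr}}\,Xz + (g-f)\,\mathsf{snr}\,X^2\right]\,\middle|\,\mathsf{snr}\right\}\right]^{u}\mathrm{d}z\right\}.
\end{aligned} \tag{121}
\end{equation}

Let $\mathbf{Q}^*$ be the replica-symmetric solution to (112)-(113). The
free energy is then found by (84) and (108):

\begin{equation}
\mathcal{F} = -\lim_{u\to 0}\frac{\partial}{\partial u}\left[\beta^{-1}G^{(u)}(\mathbf{Q}^*) - I^{(u)}(\mathbf{Q}^*)\right]. \tag{122}
\end{equation}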
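The closed form (116) can be cross-checked against a direct evaluation of (101) with the matrix of (102) and a replica-symmetric correlation matrix of the form (115a). The following is a minimal numerical sketch under stated assumptions: the parameter values are arbitrary illustrative choices (picked so that the correlation matrix is a valid covariance), natural logarithms are used, and the function names are hypothetical rather than taken from the paper.

```python
import numpy as np

def sigma_matrix(beta, sigma2, u):
    """The (u+1)x(u+1) matrix of (102); index 0 refers to the original system."""
    e = np.ones((u, 1))
    blocks = np.block([
        [np.array([[float(u)]]), -e.T],
        [-e, (1 + u / sigma2) * np.eye(u) - (e @ e.T) / sigma2],
    ])
    return beta / (sigma2 + u) * blocks

def replica_symmetric_Q(r, m, p, q, u):
    """Replica-symmetric correlation matrix with the structure of (115a)."""
    Q = np.full((u + 1, u + 1), q, dtype=float)
    Q[0, :] = m
    Q[:, 0] = m
    Q[0, 0] = r
    idx = np.arange(1, u + 1)
    Q[idx, idx] = p
    return Q

def G_direct(Q, beta, sigma2, u):
    """G^{(u)}(Q) evaluated from (101)."""
    S = sigma_matrix(beta, sigma2, u)
    _, logdet = np.linalg.slogdet(np.eye(u + 1) + S @ Q)
    return (-0.5 * logdet - 0.5 * np.log(1 + u / sigma2)
            - 0.5 * u * np.log(2 * np.pi * sigma2))

def G_closed_form(r, m, p, q, beta, sigma2, u):
    """G^{(u)}(Q*) from the replica-symmetric expression (116)."""
    a = 1 + beta / sigma2 * (p - q)
    b = a + u / sigma2 * (1 + beta * (r - 2 * m + q))
    return (-0.5 * u * np.log(2 * np.pi * sigma2)
            - 0.5 * (u - 1) * np.log(a) - 0.5 * np.log(b))

# Illustrative parameter values (assumptions, not taken from the paper).
params = dict(r=1.0, m=0.3, p=0.9, q=0.2)
beta, sigma2 = 0.8, 0.5
for u in range(1, 6):
    Q = replica_symmetric_Q(u=u, **params)
    print(u, G_direct(Q, beta, sigma2, u), G_closed_form(beta=beta, sigma2=sigma2, u=u, **params))
```

The check only covers positive integer $u$, which is where the matrices are defined; the continuation of (116) to real $u$ near 0 remains an assumption of the replica method.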


The eight parameters $(r, m, p, q, c, d, f, g)$ that define $\mathbf{Q}^*$
and $\tilde{\mathbf{Q}}^*$ are the solution to the joint equations (112)-(113)
under replica symmetry. It is interesting to note that as
functions of $u$, the derivative of each of the eight parameters
with respect to $u$ vanishes as $u \to 0$. Thus for the purpose
of the free energy (122), it suffices to find the extremum of
$[\beta^{-1}G^{(u)} - I^{(u)}]$ at $u = 0$. Using (113), it can be shown that
at $u = 0$,

\begin{align}
c &= 0, \tag{123a}\\
d &= \frac{1}{2[\sigma^2+\beta(p-q)]}, \tag{123b}\\
f &= \frac{1+\beta(r-2m+q)}{2[\sigma^2+\beta(p-q)]^2}, \tag{123c}\\
g &= f - d. \tag{123d}
\end{align}
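The parameters $r, m, p, q$ can be determined from (114) by
studying the measure $p_{\mathbf{X},\mathsf{snr}|\tilde{\mathbf{Q}}^*}$ under replica symmetry and
$u \to 0$. For that purpose, define two useful parameters:

\begin{equation}
\eta = \frac{2d^2}{f} \quad\text{and}\quad \xi = 2d. \tag{124}
\end{equation}

Noticing that $c = 0$ and $g - f = -d$, (120) can be written as

\begin{equation}
M^{(u)}(\tilde{\mathbf{Q}}^*) = \mathsf{E}\left\{\sqrt{\frac{\eta}{2\pi}}\int\exp\left[-\frac{\eta}{2}\big(z-\sqrt{\mathsf{snr}}\,X_0\big)^2\right]\left[\mathsf{E}_q\left\{\exp\left[\xi\sqrt{\mathsf{snr}}\,Xz - \frac{\xi}{2}\,\mathsf{snr}\,X^2\right]\,\middle|\,\mathsf{snr}\right\}\right]^{u}\mathrm{d}z\right\}. \tag{125}
\end{equation}

It is clear that the limit of (125) as $u \to 0$ is 1. Hence by
(112), as $u \to 0$,

\begin{align}
Q^*_{ab} &= \mathsf{E}\left\{\mathsf{snr}\,X_aX_b\,\middle|\,\tilde{\mathbf{Q}}^*\right\} \tag{126}\\
&\to \mathsf{E}\left\{\mathsf{snr}\,X_aX_b\exp\left[\mathsf{snr}\,\mathbf{X}^{\top}\tilde{\mathbf{Q}}^*\mathbf{X}\right]\right\}. \tag{127}
\end{align}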

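We now give a useful representation for the parameters
$r, m, p, q$ defined in (115). Consider for instance $a = 0$ and
$b = 1$. Note that as $u \to 0$,

\begin{equation}
\mathsf{E}\left\{\mathsf{snr}\,X_0X_1\exp\left[\mathsf{snr}\,\mathbf{X}^{\top}\tilde{\mathbf{Q}}^*\mathbf{X}\right]\right\}
= \mathsf{E}\left\{\mathsf{snr}\,X_0\sqrt{\frac{\eta}{2\pi}}\int\exp\left[-\frac{\eta}{2}\big(z-\sqrt{\mathsf{snr}}\,X_0\big)^2\right]\frac{\mathsf{E}_q\left\{X_1\exp\left[-\frac{\xi}{2}\big(z-\sqrt{\mathsf{snr}}\,X_1\big)^2\right]\,\middle|\,\mathsf{snr}\right\}}{\mathsf{E}_q\left\{\exp\left[-\frac{\xi}{2}\big(z-\sqrt{\mathsf{snr}}\,X\big)^2\right]\,\middle|\,\mathsf{snr}\right\}}\,\mathrm{d}z\right\}. \tag{128}
\end{equation}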

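Let the two single-user Gaussian channels $p_{Z|X,\mathsf{snr};\eta}$ and
$q_{Z|X,\mathsf{snr};\xi}$ be defined as in Section II.
Assuming that the input distribution to the channel $q_{Z|X,\mathsf{snr};\xi}$
is $q_X$, a posterior probability distribution $q_{X|Z,\mathsf{snr};\xi}$ is induced,
which defines a retrochannel. Let $X_0$ be the input to the channel
$p_{Z|X,\mathsf{snr};\eta}$ and $X = X_1$ be the output of the retrochannel
$q_{X|Z,\mathsf{snr};\xi}$. The posterior mean with respect to the measure
$q$, denoted by $\langle X\rangle_q$, is given by (18). The Gaussian channel
$p_{Z|X,\mathsf{snr};\eta}$, the retrochannel $q_{X|Z,\mathsf{snr};\xi}$ and the PME, all in the
single-user setting, are depicted in Figure 5(b). Then, (128)
can be understood as an expectation over $X_0$, $X$ and $Z$ to
obtain

\begin{align}
Q^*_{01} &= \mathsf{E}\left\{\mathsf{snr}\,X_0X_1\exp\left[\mathsf{snr}\,\mathbf{X}^{\top}\tilde{\mathbf{Q}}^*\mathbf{X}\right]\right\} \tag{129}\\
&= \mathsf{E}\left\{\mathsf{snr}\,X_0\int\mathsf{E}_q\{X\,|\,Z=z,\mathsf{snr};\xi\}\;p_{Z|X,\mathsf{snr};\eta}(z|X_0,\mathsf{snr};\eta)\,\mathrm{d}z\right\} \tag{130}\\
&= \mathsf{E}\left\{\mathsf{snr}\,X_0\langle X\rangle_q\right\}. \tag{131}
\end{align}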

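Similarly, (127) can be evaluated for all indexes $(a, b)$ yielding
together with (115):

\begin{align}
r &= Q^*_{00} = \mathsf{E}\{\mathsf{snr}\,X_0^2\} = \mathsf{E}\{\mathsf{snr}\}, \tag{132a}\\
m &= Q^*_{01} = \mathsf{E}\{\mathsf{snr}\,X_0\langle X\rangle_q\}, \tag{132b}\\
p &= Q^*_{11} = \mathsf{E}\{\mathsf{snr}\,X^2\}, \tag{132c}\\
q &= Q^*_{12} = \mathsf{E}\{\mathsf{snr}\,(\langle X\rangle_q)^2\}. \tag{132d}
\end{align}

In summary, under replica symmetry, the parameters $c, d, f, g$
are given by (123) as functions of $r, m, p, q$, which are in turn
determined by the statistics of the two single-user channels
$p_{Z|X,\mathsf{snr};\eta}$ and $q_{Z|X,\mathsf{snr};\xi}$,
parameterized by $\eta = 2d^2/f$ and $\xi = 2d$ respectively. It is
not difficult to see that

\begin{align}
r - 2m + q &= \mathsf{E}\left\{\mathsf{snr}\,(X_0-\langle X\rangle_q)^2\right\}, \tag{133a}\\
p - q &= \mathsf{E}\left\{\mathsf{snr}\,(X-\langle X\rangle_q)^2\right\}. \tag{133b}
\end{align}

Using (123) and (124), it can be checked that

\begin{align}
r - 2m + q &= \frac{1}{\beta}\left(\frac{1}{\eta}-1\right), \tag{134a}\\
p - q &= \frac{1}{\beta}\left(\frac{1}{\xi}-\sigma^2\right). \tag{134b}
\end{align}

Thus $G^{(u)}$ and $I^{(u)}$ given by (116) and (121) can be expressed
in $\eta$ and $\xi$. Using (122) and (134), the free energy is found
in terms of $(\eta, \xi)$, where $(\eta, \xi)$ satisfies the fixed-point equations

\begin{align}
\eta^{-1} &= 1 + \beta\,\mathsf{E}\left\{\mathsf{snr}\,(X_0-\langle X\rangle_q)^2\right\}, \tag{135a}\\
\xi^{-1} &= \sigma^2 + \beta\,\mathsf{E}\left\{\mathsf{snr}\,(X-\langle X\rangle_q)^2\right\}. \tag{135b}
\end{align}

Because of the supremum in (110), in case of multiple solutions to (135), $(\eta, \xi)$
is chosen as the solution that gives the minimum free energy
$\mathcal{F}$. By defining $\mathcal{E}(\mathsf{snr}; \eta, \xi)$ and $\mathcal{V}(\mathsf{snr}; \eta, \xi)$ as in (19) and (20),
the coupled equations (123) and (132) can be summarized to
establish the key fixed-point equations stated in Section II. It will be shown
in Section IV-B that, from an individual user's viewpoint, the
multiuser PME and the multiuser retrochannel, parameterized
by arbitrary $(q_X, \sigma)$, have an equivalence as a single-user PME
and a single-user retrochannel.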

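Finally, for the purpose of the total spectral efficiency, we set
the postulated measure $q$ to be identical to the actual measure $p$
(i.e., $q_X = p_X$ and $\sigma = 1$). The inverse noise variances $(\eta, \xi)$
satisfy joint equations but we choose the replica-symmetric
solution $\eta = \xi$ as argued earlier. Using (80), the total
spectral efficiency is

\begin{equation}
C_{\text{joint}} = -\beta\,\mathsf{E}\left\{\int p_{Z|\mathsf{snr};\eta}(z|\mathsf{snr};\eta)\log p_{Z|\mathsf{snr};\eta}(z|\mathsf{snr};\eta)\,\mathrm{d}z\right\} - \frac{\beta}{2}\log\frac{2\pi e}{\eta} + \frac{1}{2}(\eta-1-\log\eta), \tag{136}
\end{equation}

where $\eta$ satisfies

\begin{equation}
\eta + \eta\,\beta\,\mathsf{E}\left\{\mathsf{snr}\left[1-\int\frac{[p_1(z,\mathsf{snr};\eta)]^2}{p_{Z|\mathsf{snr};\eta}(z|\mathsf{snr};\eta)}\,\mathrm{d}z\right]\right\} = 1. \tag{137}
\end{equation}

The optimal spectral efficiency of the multiuser channel is thus
found.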
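As an illustration of how a fixed-point equation such as (137) is solved in practice, consider the special case of standard Gaussian inputs with equal received SNR for all users. This case is an assumption made here for tractability: the bracketed term in (137) is then the MMSE of a Gaussian input at SNR $\eta\,\mathsf{snr}$, namely $1/(1+\eta\,\mathsf{snr})$, and (137) becomes $\eta + \eta\beta\,\mathsf{snr}/(1+\eta\,\mathsf{snr}) = 1$, a quadratic in $\eta$. The sketch below compares a plain fixed-point iteration with the positive root of that quadratic; the function names are hypothetical.

```python
import math

def eta_fixed_point(beta, snr, iters=500):
    """Iterate eta <- 1 / (1 + beta*snr / (1 + eta*snr)), starting from eta = 1."""
    eta = 1.0
    for _ in range(iters):
        eta = 1.0 / (1.0 + beta * snr / (1.0 + eta * snr))
    return eta

def eta_closed_form(beta, snr):
    """Positive root of snr*eta^2 + (1 + beta*snr - snr)*eta - 1 = 0."""
    b = 1.0 + beta * snr - snr
    return (-b + math.sqrt(b * b + 4.0 * snr)) / (2.0 * snr)

for beta, snr in [(0.5, 4.0), (1.0, 10.0), (2.0, 10.0)]:
    print(beta, snr, eta_fixed_point(beta, snr), eta_closed_form(beta, snr))
```

For general input distributions the MMSE has no closed form and the integral in (137) would have to be evaluated numerically inside the iteration; multiple fixed points may then exist, in which case the text prescribes the solution with minimum free energy.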


B. Joint Moments 

Consider again the Gaussian channel, the PME and the
retrochannel in the multiuser setting depicted in Figure 5(a).
The joint moments (82) are of interest here. For simplicity,
we first study joint moments of the input symbol and the
retrochannel output, which can be obtained as expectations
under the replicated system [57, Lemma 3.1]:

\begin{equation}
\mathsf{E}\left\{X_{0k}^{i}X_{k}^{j}\right\} = \mathsf{E}\left\{X_{0k}^{i}X_{mk}^{j}\right\}. \tag{138}
\end{equation}

It is then straightforward to calculate (82) by following the
same procedure.

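The following lemma allows us to determine the expected
value of a function of the symbols and their replicas by
considering a modified partition function akin to (87).

Lemma 1: Given an arbitrary function $f(\mathbf{X}_0, \underline{\mathbf{X}}_a)$ where
$\underline{\mathbf{X}}_a = [\mathbf{X}_1, \ldots, \mathbf{X}_u]$, define

\begin{equation}
Z^{(u)}(\mathbf{y},\mathbf{S},\mathbf{x}_0;h) = \mathsf{E}_q\left\{\exp\left[h f(\mathbf{x}_0,\underline{\mathbf{X}}_a)\right]\prod_{a=1}^{u}q_{\mathbf{Y}|\mathbf{X},\mathbf{S}}(\mathbf{y}|\mathbf{X}_a,\mathbf{S})\,\middle|\,\mathbf{S}\right\} \tag{139}
\end{equation}

where $\mathbf{X}_a$ has i.i.d. entries with distribution $q_X$. If
$\mathsf{E}\{f(\mathbf{X}_0,\underline{\mathbf{X}}_a)\,|\,\mathbf{Y},\mathbf{S},\mathbf{X}_0\}$ is not dependent on $u$, then

\begin{equation}
\mathsf{E}\{f(\mathbf{X}_0,\underline{\mathbf{X}}_a)\} = \lim_{u\to 0}\frac{\partial}{\partial h}\log\mathsf{E}\left\{Z^{(u)}(\mathbf{Y},\mathbf{S},\mathbf{X}_0;h)\right\}\bigg|_{h=0}. \tag{140}
\end{equation}

Proof: It is easy to see that

\begin{equation}
Z^{(u)}(\mathbf{Y},\mathbf{S},\mathbf{X}_0;h)\Big|_{h=0} = Z^{u}(\mathbf{Y},\mathbf{S}). \tag{141}
\end{equation}

By taking the derivative and letting $h = 0$, and using that
$\mathsf{E}\{Z^u(\mathbf{Y},\mathbf{S})\} \to 1$ as $u \to 0$, the right hand side
of (140) is

\begin{equation}
\lim_{u\to 0}\mathsf{E}\left\{f(\mathbf{X}_0,\underline{\mathbf{X}}'_a)\prod_{a=1}^{u}q_{\mathbf{Y}|\mathbf{X},\mathbf{S}}(\mathbf{Y}|\mathbf{X}'_a,\mathbf{S})\right\} \tag{142}
\end{equation}

where $\underline{\mathbf{X}}'_a$ has the same statistics as $\underline{\mathbf{X}}_a$ (i.e., contains i.i.d.
entries with distribution $q_X$) but is independent of $(\mathbf{X}_0, \mathbf{Y}, \mathbf{S})$.
Also note that

\begin{equation}
q_{\underline{\mathbf{X}}_a|\mathbf{Y},\mathbf{S}}(\underline{\mathbf{X}}_a|\mathbf{Y},\mathbf{S}) = Z^{-u}(\mathbf{Y},\mathbf{S})\,q_{\underline{\mathbf{X}}_a}(\underline{\mathbf{X}}_a)\prod_{a=1}^{u}q_{\mathbf{Y}|\mathbf{X},\mathbf{S}}(\mathbf{Y}|\mathbf{X}_a,\mathbf{S}). \tag{143}
\end{equation}

One can change the expectation over the replicas $\underline{\mathbf{X}}'_a$ independent
of $(\mathbf{Y}, \mathbf{S}, \mathbf{X}_0)$ to an expectation over $\underline{\mathbf{X}}_a$ conditioned on
$(\mathbf{Y}, \mathbf{S}, \mathbf{X}_0)$. Hence (142) can be further written as

\begin{align}
\lim_{u\to 0}\mathsf{E}\left\{\mathsf{E}\{f(\mathbf{X}_0,\underline{\mathbf{X}}_a)\,|\,\mathbf{Y},\mathbf{S},\mathbf{X}_0\}\,Z^{u}(\mathbf{Y},\mathbf{S})\right\}
&= \mathsf{E}\left\{\mathsf{E}\{f(\mathbf{X}_0,\underline{\mathbf{X}}_a)\,|\,\mathbf{Y},\mathbf{S},\mathbf{X}_0\}\right\} \tag{144}\\
&= \mathsf{E}\{f(\mathbf{X}_0,\underline{\mathbf{X}}_a)\} \tag{145}
\end{align}

where $Z^{u}(\mathbf{Y},\mathbf{S})$ can be dropped as $u \to 0$ in (144) since
the conditional expectation is not dependent on $u$ by the
assumption in the lemma. ■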

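For the function $f(\mathbf{X}_0, \underline{\mathbf{X}}_a)$ to have influence on the free
energy, it must grow at least linearly with $K$. Assume that
$f(\mathbf{X}_0, \underline{\mathbf{X}}_a)$ involves users 1 through $K_1 = \alpha_1 K$ where $0 <
\alpha_1 < 1$ is fixed as $K \to \infty$:

\begin{equation}
f(\mathbf{X}_0,\underline{\mathbf{X}}_a) = \sum_{k=1}^{K_1}X_{0k}^{i}X_{mk}^{j} \tag{146}
\end{equation}

where $m$ is an arbitrary replica number in $\{1, \ldots, u\}$. Without
loss of generality, we calculate (138) for a user $k \in
\{1, \ldots, K_1\}$. It is also assumed that users 1 through $K_1$ take
the same signal-to-noise ratio $\mathsf{snr}$. We will finally take the
limit $\alpha_1 \to 0$ so that the equal-power constraint for the first
$K_1$ users becomes superfluous.

Clearly, the moments (138) for user $k$ can be rewritten as

\begin{align}
\mathsf{E}\left\{X_{0k}^{i}X_{mk}^{j}\right\} &= \frac{1}{K_1}\sum_{k=1}^{K_1}\mathsf{E}\left\{X_{0k}^{i}X_{mk}^{j}\right\} \tag{147}\\
&= \frac{1}{K_1}\mathsf{E}\{f(\mathbf{X}_0,\underline{\mathbf{X}}_a)\}. \tag{148}
\end{align}

Note that

\begin{equation}
\mathsf{E}\{f(\mathbf{X}_0,\underline{\mathbf{X}}_a)\,|\,\mathbf{Y},\mathbf{S},\mathbf{X}_0\} = \mathsf{E}\left\{\sum_{k=1}^{K_1}X_{0k}^{i}X_{mk}^{j}\,\middle|\,\mathbf{Y},\mathbf{S},\mathbf{X}_0\right\} \tag{149}
\end{equation}

is not dependent on $u$. By Lemma 1, the moments (148) can
be obtained as

\begin{equation}
\lim_{u\to 0}\frac{\partial}{\partial h}\,\frac{1}{\alpha_1 K}\log\mathsf{E}\left\{Z^{(u)}(\mathbf{Y},\mathbf{S},\mathbf{X}_0;h)\right\}\bigg|_{h=0} \tag{150}
\end{equation}

where

\begin{equation}
Z^{(u)}(\mathbf{y},\mathbf{S},\mathbf{x}_0;h) = (2\pi\sigma^2)^{-\frac{uL}{2}}\,\mathsf{E}_q\left\{\exp\left[h\sum_{k=1}^{K_1}x_{0k}^{i}X_{mk}^{j}\right]\prod_{a=1}^{u}\exp\left[-\frac{1}{2\sigma^2}\|\mathbf{y}-\mathbf{S}\mathbf{X}_a\|^2\right]\,\middle|\,\mathbf{S}\right\}. \tag{151}
\end{equation}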

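Regarding (151) as a partition function for some random
system allows the same techniques in Section IV-A to be used
to write

\begin{equation}
\lim_{K\to\infty}\frac{1}{K}\log\mathsf{E}\left\{Z^{(u)}(\mathbf{Y},\mathbf{S},\mathbf{X}_0;h)\right\} = \sup_{\mathbf{Q}}\left[\beta^{-1}G^{(u)}(\mathbf{Q}) - I^{(u)}(\mathbf{Q};h)\right] \tag{152}
\end{equation}

where $G^{(u)}(\mathbf{Q})$ is given by (101) and $I^{(u)}(\mathbf{Q};h)$ is the rate
of the following measure (cf. (105))

\begin{equation}
\mu_K^{(u)}(\mathbf{Q};h) = \mathsf{E}\left\{\prod_{0\le a\le b\le u}\delta\left(\sum_{k=1}^{K}\mathsf{snr}_kX_{ak}X_{bk} - KQ_{ab}\right)\exp\left[h\sum_{k=1}^{K_1}X_{0k}^{i}X_{mk}^{j}\right]\right\}. \tag{153}
\end{equation}

By the large deviations property, one finds the rate

\begin{equation}
I^{(u)}(\mathbf{Q};h) = \sup_{\tilde{\mathbf{Q}}}\left\{\mathrm{tr}\{\tilde{\mathbf{Q}}\mathbf{Q}\} - \log M^{(u)}(\tilde{\mathbf{Q}}) - \alpha_1\left(\log M^{(u)}(\tilde{\mathbf{Q}},\mathsf{snr};h) - \log M^{(u)}(\tilde{\mathbf{Q}},\mathsf{snr};0)\right)\right\} \tag{154}
\end{equation}

where $M^{(u)}(\tilde{\mathbf{Q}})$ is defined in (106), and

\begin{equation}
M^{(u)}(\tilde{\mathbf{Q}},\mathsf{snr};h) = \mathsf{E}\left\{\exp\left[hX_0^{i}X_m^{j}\right]\exp\left[\mathsf{snr}\,\mathbf{X}^{\top}\tilde{\mathbf{Q}}\mathbf{X}\right]\,\middle|\,\mathsf{snr}\right\}. \tag{155}
\end{equation}

From (152) and (154), taking the derivative in (150) with
respect to $h$ at $h = 0$ leaves only one term

\begin{equation}
\frac{\partial}{\partial h}\log M^{(u)}(\tilde{\mathbf{Q}},\mathsf{snr};h)\bigg|_{h=0} = \frac{\mathsf{E}\left\{X_0^{i}X_m^{j}\exp\left[\mathsf{snr}\,\mathbf{X}^{\top}\tilde{\mathbf{Q}}\mathbf{X}\right]\,\middle|\,\mathsf{snr}\right\}}{\mathsf{E}\left\{\exp\left[\mathsf{snr}\,\mathbf{X}^{\top}\tilde{\mathbf{Q}}\mathbf{X}\right]\,\middle|\,\mathsf{snr}\right\}}. \tag{156}
\end{equation}

Since

\begin{equation}
Z^{(u)}(\mathbf{Y},\mathbf{S},\mathbf{X}_0;h)\Big|_{h=0} = Z^{u}(\mathbf{Y},\mathbf{S}), \tag{157}
\end{equation}

the $\tilde{\mathbf{Q}}$ in (156) that gives the supremum in (154) at $h = 0$
is exactly the $\tilde{\mathbf{Q}}$ that gives the supremum of (107), which is
replica-symmetric by assumption. By introducing the parameters
$(\eta, \xi)$ the same as in Section IV-A, and by definition of
$q_i$ and $p_i$ in (24) and (27) respectively, (156) can be further
evaluated as

\begin{equation}
\frac{\displaystyle\int\left(\sqrt{\tfrac{2\pi}{\xi}}\,e^{\frac{\xi z^2}{2}}\right)^{u}p_i(z,\mathsf{snr};\eta)\,q_0^{u-1}(z,\mathsf{snr};\xi)\,q_j(z,\mathsf{snr};\xi)\,\mathrm{d}z}{\displaystyle\int\left(\sqrt{\tfrac{2\pi}{\xi}}\,e^{\frac{\xi z^2}{2}}\right)^{u}p_0(z,\mathsf{snr};\eta)\,q_0^{u}(z,\mathsf{snr};\xi)\,\mathrm{d}z}. \tag{158}
\end{equation}

Taking the limit $u \to 0$, one has from (148)-(158) that as
$K \to \infty$,

\begin{equation}
\frac{1}{K_1}\sum_{k=1}^{K_1}\mathsf{E}\left\{X_{0k}^{i}X_{mk}^{j}\right\} \to \int p_i(z,\mathsf{snr};\eta)\,\frac{q_j(z,\mathsf{snr};\xi)}{q_0(z,\mathsf{snr};\xi)}\,\mathrm{d}z. \tag{159}
\end{equation}

Let $X_0 \sim p_X$ be the input to the single-user Gaussian channel
$p_{Z|X,\mathsf{snr};\eta}$ and $Z$ be its output (see Figure 5(b)). Let $X$ be the
corresponding output of the companion retrochannel with $Z$
as its input. Then $X_0$ - $Z$ - $X$ is a Markov chain. By definition
of $p_i$ and $q_i$, the right hand side of (159) is

\begin{equation}
\int p_0(z,\mathsf{snr};\eta)\,\frac{p_i(z,\mathsf{snr};\eta)}{p_0(z,\mathsf{snr};\eta)}\,\frac{q_j(z,\mathsf{snr};\xi)}{q_0(z,\mathsf{snr};\xi)}\,\mathrm{d}z = \mathsf{E}\left\{\mathsf{E}\{X_0^{i}\,|\,Z\}\,\mathsf{E}\{X^{j}\,|\,Z\}\right\}. \tag{160}
\end{equation}

Letting $K_1 = 1$ (thus $\alpha_1 \to 0$) so that the requirement that
the first $K_1$ users take the same SNR becomes unnecessary,
we have proved by (138), (159) and (160) that for every
SNR distribution and every user $k \in \{1, \ldots, K\}$,

\begin{equation}
\mathsf{E}\left\{X_{0k}^{i}X_{k}^{j}\right\} \to \mathsf{E}\left\{X_0^{i}X^{j}\right\} \quad\text{as } K \to \infty. \tag{161}
\end{equation}

Since the moments (161) are uniformly bounded, the distribution
is thus uniquely determined by the moments due to
Carleman's Theorem [65, p. 227]. Therefore, for every user $k$,
the joint distribution of the input $X_{0k}$ to the multiuser channel
and the output $X_k$ of the multiuser retrochannel converges
to the joint distribution of the input $X_0$ to the single-user
Gaussian channel $p_{Z|X,\mathsf{snr};\eta}$ and the output $X$ of the
single-user retrochannel $q_{X|Z,\mathsf{snr};\xi}$.

Applying the same methodology as developed thus far in
this subsection, one can also calculate the joint moments (82)
by letting

\begin{equation}
f(\mathbf{X}_0,\underline{\mathbf{X}}_a) = \sum_{k=1}^{K_1}X_{0k}^{i}X_{mk}^{j}\prod_{a=1}^{l}X_{ak} \tag{162}
\end{equation}

where it is assumed that $m > l$. The rationale is that $\mathbf{X}_0$ -
$(\mathbf{Y}, \mathbf{S})$ - $\mathbf{X}_a$ is a Markov chain and the $\mathbf{X}_a$ are i.i.d. conditioned
on $(\mathbf{Y}, \mathbf{S})$; hence (82) can be calculated as expectations under
the replicated system:

\begin{align}
\mathsf{E}\left\{X_{0k}^{i}X_{k}^{j}\langle X_k\rangle_q^{l}\right\}
&= \mathsf{E}\left\{X_{0k}^{i}X_{mk}^{j}\prod_{a=1}^{l}\mathsf{E}\{X_{ak}\,|\,\mathbf{Y},\mathbf{S}\}\right\} \tag{163}\\
&= \frac{1}{K_1}\mathsf{E}\{f(\mathbf{X}_0,\underline{\mathbf{X}}_a)\}. \tag{164}
\end{align}

It is straightforward by Lemma 1 to calculate (164) and obtain
that, as $K \to \infty$,

\begin{equation}
\frac{1}{K_1}\mathsf{E}\{f(\mathbf{X}_0,\underline{\mathbf{X}}_a)\} \to \int p_i(z,\mathsf{snr};\eta)\,\frac{q_j(z,\mathsf{snr};\xi)}{q_0(z,\mathsf{snr};\xi)}\left[\frac{q_1(z,\mathsf{snr};\xi)}{q_0(z,\mathsf{snr};\xi)}\right]^{l}\mathrm{d}z. \tag{165}
\end{equation}

Let $\langle X\rangle_q$ be the single-user PME output as seen in Figure
5(b), which is a function of the Gaussian channel output $Z$.
Then the right hand side of (165) represents a joint moment
and thus

\begin{equation}
\mathsf{E}\left\{X_{0k}^{i}X_{k}^{j}\langle X_k\rangle_q^{l}\right\} \to \mathsf{E}\left\{X_0^{i}X^{j}\langle X\rangle_q^{l}\right\}. \tag{166}
\end{equation}

Again, by Carleman's Theorem, the joint distributions of
$(X_{0k}, X_k, \langle X_k\rangle_q)$ converge to that of $(X_0, X, \langle X\rangle_q)$. Indeed,
from the viewpoint of user $k$, the multiuser setting is equivalent
to the single-user setting in which the SNR suffers a degradation
$\eta$ (compare Figures 5(b) and 5(a)). Hence we have proved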
the decoupling principle and Claim [Q 

In the large-system limit, the transformation from the input
$X_{0k}$ to the multiuser detection output $\langle X_k\rangle_q$ is nothing but
a single-user Gaussian channel $p_{Z|X,\mathsf{snr};\eta}$ concatenated with a
decision function (23). The decision function can be ignored
from both detection- and information-theoretic viewpoints due
to its monotonicity:

Proposition 1: The decision function (23) is strictly monotone
increasing in $z$ for all $\mathsf{snr}$ and $\xi$.

Proof: Let $(\cdot)'$ denote derivative with respect to $z$. One
can show that for $i = 0, 1, \ldots$,

\begin{equation}
q_i'(z,\mathsf{snr};\xi) = \xi\sqrt{\mathsf{snr}}\;q_{i+1}(z,\mathsf{snr};\xi) - \xi z\,q_i(z,\mathsf{snr};\xi). \tag{167}
\end{equation}

Clearly,

\begin{equation}
\left[\frac{q_1(z,\mathsf{snr};\xi)}{q_0(z,\mathsf{snr};\xi)}\right]' = \xi\sqrt{\mathsf{snr}}\;\frac{q_2(z,\mathsf{snr};\xi)\,q_0(z,\mathsf{snr};\xi) - q_1^2(z,\mathsf{snr};\xi)}{q_0^2(z,\mathsf{snr};\xi)}. \tag{168}
\end{equation}

The numerator in (168) is nonnegative by the Cauchy-Schwarz
inequality. For the numerator in (168) to be 0, $X$
must be a constant, which contradicts the assumption that $X$
has zero mean and unit variance. Therefore, (23) is strictly
increasing. ■
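Proposition 1 can be illustrated for a concrete postulated input. The sketch below assumes equiprobable binary symbols $\pm 1$ under $q_X$ (an illustrative choice) and takes $q_i(z,\mathsf{snr};\xi) = \mathsf{E}_q\{X^i\sqrt{\xi/2\pi}\exp[-\tfrac{\xi}{2}(z-\sqrt{\mathsf{snr}}X)^2]\}$, which is the form consistent with (167). In this case the ratio $q_1/q_0$ reduces to $\tanh(\xi\sqrt{\mathsf{snr}}\,z)$, which is strictly increasing. The function names are hypothetical.

```python
import numpy as np

def q_moment(i, z, snr, xi, symbols=(-1.0, 1.0)):
    """q_i(z, snr; xi) for equiprobable postulated input symbols (binary by default)."""
    z = np.asarray(z, dtype=float)[..., None]
    x = np.asarray(symbols)
    density = np.sqrt(xi / (2 * np.pi)) * np.exp(-0.5 * xi * (z - np.sqrt(snr) * x) ** 2)
    return np.mean(x ** i * density, axis=-1)

def decision_function(z, snr, xi):
    """Posterior mean q_1 / q_0 as a function of the channel output z."""
    return q_moment(1, z, snr, xi) / q_moment(0, z, snr, xi)

z = np.linspace(-3, 3, 601)
out = decision_function(z, snr=2.0, xi=0.7)
print(np.max(np.abs(out - np.tanh(0.7 * np.sqrt(2.0) * z))))  # deviation from tanh form
print(bool(np.all(np.diff(out) > 0)))                          # strictly increasing on the grid
```

Other discrete constellations can be tried by passing a different `symbols` tuple; the monotonicity argument of the proof does not depend on the binary assumption.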

We may now conclude that the equivalent single-user channel
is an additive Gaussian noise channel with input
signal-to-noise ratio $\mathsf{snr}$ and noise variance $\eta^{-1}$ as depicted in Figure
|5(b)| Corollaries [2 and |2] are thus proved. In the special case 
that the postulated measure q is identical to the actual measure 
p. Claim [2 reduces to Claim |2] 

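The single-user mutual information is now simply that of a
Gaussian channel with input distribution $p_X$,

\begin{equation}
I(\eta\,\mathsf{snr}) = -\int p_{Z|\mathsf{snr};\eta}(z|\mathsf{snr};\eta)\log p_{Z|\mathsf{snr};\eta}(z|\mathsf{snr};\eta)\,\mathrm{d}z - \frac{1}{2}\log\frac{2\pi e}{\eta}, \tag{169}
\end{equation}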

which is as defined in (I 23 . The overall spectral efficiency 
under separate decoding is 


\begin{equation}
C_{\text{sep}} = \beta\,\mathsf{E}\{I(\eta\,\mathsf{snr})\}. \tag{170}
\end{equation}

Hence the proof of (26). Claim 3 is proved by comparing (170)
to (136).


V. Complex-valued Channels 

Until now the discussion is based on a real-valued setting 
of the multiuser system, namely, both the inputs Xk and the 
spreading chips Snk take real values. In practice, particularly 
in carrier-modulated communications where spectral efficiency 
is a major concern, transmission in the complex domain must 
be addressed. Either the input symbols or the spreading chips 


or both can take values in the complex number set. In the 
complex-valued setting, the channel model 0 is equivalent to 
the following real-valued one: 


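\begin{equation}
\begin{bmatrix}\mathbf{Y}^{(r)}\\ \mathbf{Y}^{(i)}\end{bmatrix}
= \begin{bmatrix}\mathbf{S}^{(r)} & -\mathbf{S}^{(i)}\\ \mathbf{S}^{(i)} & \mathbf{S}^{(r)}\end{bmatrix}
\begin{bmatrix}\mathbf{X}^{(r)}\\ \mathbf{X}^{(i)}\end{bmatrix}
+ \begin{bmatrix}\mathbf{N}^{(r)}\\ \mathbf{N}^{(i)}\end{bmatrix} \tag{171}
\end{equation}

where the superscripts (r) and (i) denote real and imaginary
components respectively. Note that the previous analysis does
not apply to (171) since the entries of the channel state matrix
are not i.i.d. in this case.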
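The block structure of the real-valued representation (171) can be checked numerically against the complex model $\mathbf{Y} = \mathbf{S}\mathbf{X} + \mathbf{N}$. The sketch below uses small random matrices of arbitrary size; the dimensions, the Gaussian draws and the helper name `stack_real` are illustrative assumptions.

```python
import numpy as np

def stack_real(S, x, n):
    """Real-valued form of y = S x + n, following the block structure of (171)."""
    S_big = np.block([[S.real, -S.imag],
                      [S.imag, S.real]])
    x_big = np.concatenate([x.real, x.imag])
    n_big = np.concatenate([n.real, n.imag])
    return S_big @ x_big + n_big

rng = np.random.default_rng(1)
L, K = 4, 3  # illustrative dimensions only
S = rng.normal(size=(L, K)) + 1j * rng.normal(size=(L, K))
x = rng.normal(size=K) + 1j * rng.normal(size=K)
n = rng.normal(size=L) + 1j * rng.normal(size=L)

y = S @ x + n                    # complex-valued channel
y_stacked = stack_real(S, x, n)  # stacked real-valued channel
print(np.allclose(y_stacked, np.concatenate([y.real, y.imag])))
```

The stacked $2L \times 2K$ matrix repeats each of $\mathbf{S}^{(r)}$ and $\mathbf{S}^{(i)}$ twice, which is the reason its entries are not independent.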

If the inputs take complex values but the spreading is real-valued
($\mathbf{S}^{(i)} = \mathbf{0}$), the channel can be regarded as two uses of
the real-valued channel $\mathbf{S} = \mathbf{S}^{(r)}$, where the inputs $\mathbf{X}^{(r)}$ and $\mathbf{X}^{(i)}$
to the two channels may be dependent. Since independent
inputs maximize the channel capacity, there is little reason to
transmit dependent signals in the two subchannels. Thus the
analysis of the real-valued channel in previous sections also
applies to the case of independent in-phase and quadrature
components, while the only change is that the spectral efficiency
is the sum of that of the two subchannels.

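We can also compare the real-valued and the complex-valued
channels assuming the same real-valued input distribution.
Under the complex-valued channel,

$$\begin{bmatrix} Y^{(r)} \\ Y^{(i)} \end{bmatrix} = \begin{bmatrix} S^{(r)} \\ S^{(i)} \end{bmatrix} X + \begin{bmatrix} N^{(r)} \\ N^{(i)} \end{bmatrix}, \tag{172}$$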


which is equivalent to transmitting the same real-valued $X$
twice over the two component real-valued channels. This is
equivalent to having a real-valued channel with the load $\beta$
halved.

If both the symbols and the spreading chips are complex-valued,
the analysis in the previous sections can be modified
to take this into account. For convenience it is assumed that
the real and imaginary components of the spreading chips,
$S_{nk}^{(r)}$ and $S_{nk}^{(i)}$, are i.i.d. with zero mean and unit
variance. The noise vector $N$ has i.i.d. circularly symmetric
Gaussian entries, i.e., $\mathsf{E}\{N N^{\mathsf{H}}\} = 2I$. Thus the
conditional probability density function of the actual multiuser
channel is


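$$p_{Y|X,S}(y|x,S) = (2\pi)^{-L} \exp\left[-\frac{\|y - Sx\|^2}{2}\right], \tag{173}$$

whereas that of the postulated channel is

$$q_{Y|X,S}(y|x,S) = (2\pi\sigma^2)^{-L} \exp\left[-\frac{\|y - Sx\|^2}{2\sigma^2}\right]. \tag{174}$$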


Also, the actual and the postulated input distributions $p_X$
and $q_X$ both have zero mean and unit variance,
$\mathsf{E}\{|X|^2\} = \mathsf{E}_q\{|X|^2\} = 1$. Note that the in-phase and
the quadrature components are intertwined due to complex spreading.

The replica analysis can be carried out in parallel to that
in Section IV. In the following we highlight the major
differences. Given $\{X_a\}_{a=0}^{u}$, the variables $V_a$ defined in (96) have
asymptotically independent real and imaginary components.
Thus, can be evaluated to be twice that under real-valued 
channels with 

1 ^ 

Qab = a,b = 0,...,u. (175) 
The rate $I^{(u)}$ of the measure of $Q$ is obtained as

$$I^{(u)}(Q) = \sup_{\tilde{Q}} \left[ \operatorname{tr}\{\tilde{Q} Q\} - \log \mathsf{E}\left\{\exp\left[\mathsf{snr}\, X^{\mathsf{H}} \tilde{Q} X\right]\right\} \right]. \tag{176}$$

As a result, the fixed-point joint equations for $Q$ and $\tilde{Q}$ are


Q 

Q 




E{snrXX"exp snrX"Qxj | 


E{exp snrX"Qx]} 


(177a) 

(177b) 


Under replica symmetry (115), the parameters $(c, d, f, g)$ are
found to be 2 times the corresponding values for the real-valued
channel, and $(r, m, p, q)$ are found to be the same as in (132),
except that all squares are replaced by squared norms. By defining
two parameters (which differ from (124) by a factor of 2):

77 = y and C = 7, (178) 


we have the following result. 

Claim 5: Let the multiuser posterior mean estimate of the
complex-valued multiple-access channel (173) with complex-valued
spreading be parameterized by a postulated input
distribution $q_X$ and noise level $\sigma$. Then, in the large-system
limit, the distribution of the multiuser detection output $\langle X_k \rangle_q$
conditioned on $X_k = x$ being transmitted with signal-to-noise
ratio $\mathsf{snr}_k$ is identical to the distribution of the estimate
$\langle X \rangle_q$ of a single-user complex Gaussian channel

$$Z = \sqrt{\mathsf{snr}}\,X + \frac{1}{\sqrt{\eta}}\,N \tag{179}$$

conditioned on $X = x$ being transmitted with $\mathsf{snr} = \mathsf{snr}_k$,
where $N$ is circularly symmetric Gaussian with unit variance,
$\mathsf{E}\{|N|^2\} = 1$. The multiuser efficiency $\eta$ and the inverse
noise variance $\xi$ of the postulated single-user channel
satisfy the coupled equations (21), where the mean-square
error $\mathcal{E}(\mathsf{snr}; \eta, \xi)$ of the posterior mean estimate and the
variance $\mathcal{V}(\mathsf{snr}; \eta, \xi)$ of the retrochannel are defined similarly
to those of the real-valued channel, with the squares in (19) and
(20) replaced by squared norms. In case of multiple solutions
to (21), $(\eta, \xi)$ are chosen to minimize the free energy:


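$$\begin{aligned}
&-\mathsf{E}\left\{ \int p_{Z|\mathsf{snr};\eta}(z|\mathsf{snr};\eta)\,\log q_{Z|\mathsf{snr};\xi}(z|\mathsf{snr};\xi)\,\mathrm{d}z \right\} \\
&\quad + \frac{1}{\beta}\left[(\xi - 1)\log e - \log \xi\right] + \log\frac{\xi}{\pi} - \frac{\xi}{\eta}\log e \\
&\quad + \frac{\sigma^2 \xi (\eta - \xi)}{\beta \eta}\log e + \frac{1}{\beta}\log(2\pi) + \frac{\xi}{\beta \eta}\log e.
\end{aligned} \tag{180}$$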

Corollary 3: For the complex-valued channel (173), the
mutual information of the single-user channel seen at the
multiuser posterior mean estimator output for a user with
signal-to-noise ratio $\mathsf{snr}$ takes the same formula as (25):

$$I(\eta\,\mathsf{snr}) = D\left(p_{Z|X,\mathsf{snr};\eta} \,\|\, p_{Z|\mathsf{snr};\eta} \,|\, p_X\right), \tag{181}$$

where $\eta$ is the multiuser efficiency given by Claim 5 and
$p_{Z|\mathsf{snr};\eta}$ is the marginal probability distribution of the output of
channel (179). The overall spectral efficiency under suboptimal
separate decoding is $C_{\mathrm{sep}}(\beta) = \beta\,\mathsf{E}\{I(\eta\,\mathsf{snr})\}$.


Claim 6: The optimal spectral efficiency under joint decoding
is

$$C_{\mathrm{joint}}(\beta) = \beta\,\mathsf{E}\{I(\eta\,\mathsf{snr})\} + (\eta - 1)\log e - \log \eta, \tag{182}$$

where $\eta$ is the optimal multiuser efficiency determined by
Claim 5 by postulating a measure $q$ that is identical to $p$.

It is interesting to compare the performance of the real-valued
channel and that of the complex-valued channel. We
assume the in-phase and quadrature components of the input
symbols are independent with identical distribution $p'_X$, which
has a variance of 1/2. By Claim 5, the equivalent single-user
channel (179) can also be regarded as two independent
subchannels. The mean-square error and the variance in (21)
are the sum of those of the subchannels. It can be checked
that the performance of each subchannel is identical to that of
the real-valued channel with input distribution $p'_X$ normalized
to unit variance. Note, however, that the total transmit energy
in the case of complex spreading is twice that of its
real counterpart. In all, the error performance under complex-valued
spreading is exactly the same as that under real-valued
spreading. This result simplifies the analysis of complex-valued
channels such as those that arise in multiantenna systems.
If we have control over the channel state matrix, as in CDMA
systems, complex-valued spreading should be avoided due to
its higher complexity with no direct performance gain.


VI. Numerical Results 


Figures 7 and 8 plot the simulated distribution of the posterior
mean estimate and its corresponding “hidden” Gaussian statistic.
Equal-power users with binary input are considered. We
simulate CDMA systems of 4, 8, 12 and 16 users, respectively.
The load is fixed to $\beta = 2/3$ and the SNR is 2 dB. Let $X_k = 1$
be transmitted by all users. We collect the output decision
statistics of the posterior mean estimator (i.e., the soft output
of the individually optimal detector, $\langle X_k \rangle$) out of 1000 trials.
A histogram of the statistic is obtained and then scaled to plot
an estimate of the probability density function in Figure 7. We
also apply the inverse nonlinear decision function to recover
the “hidden” Gaussian decision statistic (normalized so that
its conditional mean is equal to $X_k = 1$), which in this case
is

$$\tilde{Z}_k = \frac{\tanh^{-1}(\langle X_k \rangle)}{\eta\,\mathsf{snr}_k}. \tag{183}$$


The probability density function of $\tilde{Z}_k$ estimated from its
histogram is then compared to the theoretically predicted
Gaussian density function in Figure 8. It is clear that even
though the PME output $\langle X_k \rangle$ takes a non-Gaussian
distribution, the equivalent statistic $\tilde{Z}_k$ converges to a Gaussian
distribution centered at $X_k$ as $K$ becomes large. This result is
particularly desirable considering that the “fit” to the Gaussian
distribution is quite good even for a system with merely 8
users.
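The inversion in (183) can be sketched for a small system. The code below is a minimal illustration under stated assumptions: a real-valued channel with unit-variance noise, equiprobable binary inputs, a brute-force posterior mean over all $2^K$ hypotheses, and hypothetical function names. The multiuser efficiency passed to `hidden_statistic` is taken as given; in an experiment like the one above it would come from the fixed-point equations.

```python
import itertools
import math

def posterior_mean_binary(y, S):
    """Soft outputs <X_k> of the individually optimal detector, X in {-1, +1}^K.

    S is an L x K list of rows whose columns already include sqrt(snr_k);
    the noise is real Gaussian with unit variance.
    """
    L, K = len(S), len(S[0])
    candidates, log_weights = [], []
    for x in itertools.product((-1.0, 1.0), repeat=K):
        resid = sum((y[i] - sum(S[i][k] * x[k] for k in range(K))) ** 2
                    for i in range(L))
        candidates.append(x)
        log_weights.append(-0.5 * resid)
    shift = max(log_weights)  # subtract the maximum for numerical stability
    weights = [math.exp(w - shift) for w in log_weights]
    total = sum(weights)
    return [sum(w * x[k] for w, x in zip(weights, candidates)) / total
            for k in range(K)]

def hidden_statistic(pme, eta, snr):
    """Invert the decision function as in (183): atanh(<X_k>) / (eta * snr_k)."""
    return math.atanh(pme) / (eta * snr)

# Single-user sanity check: with K = 1 there is no interference, so eta = 1
# and the hidden statistic is simply y / sqrt(snr).
snr = 2.0
y = [0.7]
pme = posterior_mean_binary(y, [[math.sqrt(snr)]])[0]
z_tilde = hidden_statistic(pme, 1.0, snr)
print(pme, z_tilde)
```

Note that `math.atanh` diverges as the soft output approaches ±1, so at high SNR the recovered statistic is sensitive to rounding in the posterior mean.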

Fig. 7. The empirical probability density functions of the posterior mean
estimates with binary input conditioned on “+1” being transmitted.

Fig. 8. The empirical probability density functions of the “hidden” Gaussian
statistic with binary input conditioned on “+1” being transmitted. The
asymptotic Gaussian distribution predicted by the decoupling principle is also
plotted for comparison.

In Figures 9 and 10, multiuser efficiency and spectral efficiency
are plotted as functions of the average SNR. We consider
three input distributions, namely, QPSK, 8PSK, and complex
Gaussian inputs. Complex-valued spreading is assumed, where
the multiuser efficiency and the spectral efficiency are given
by Claim 5 and Corollary 3, respectively. We also consider
two SNR distributions: 1) identical SNRs for all users (perfect
power control), and 2) two groups of users of equal population
with a power difference of 10 dB. We first assume a system
load of $\beta = 1$ and then redo the experiments with $\beta = 3$.

In Figure 9(a), multiuser efficiency under complex Gaussian
inputs and linear MMSE detection is plotted as a function of
the average SNR. The load is $\beta = 1$. We find that the multiuser
efficiencies decrease from 1 to 0 as the SNR increases. The
monotonicity can be easily verified by inspecting the
Tse-Hanly equation. Transmission with unbalanced power
improves the multiuser efficiency. The corresponding spectral
efficiencies of the system are plotted in Figure 9(b). Both joint
decoding and separate decoding are considered. The gain in
the spectral efficiency due to joint decoding is small for low
SNR but significant for high SNR. Unbalanced SNR reduces
the spectral efficiency, where under separate decoding the loss
is almost negligible.
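For equal-power users the behavior described above can be reproduced in closed form. The sketch below assumes the standard equal-SNR form of the Tse-Hanly fixed point, $1/\eta = 1 + \beta\,\mathsf{snr}/(1 + \eta\,\mathsf{snr})$, which reduces to a quadratic in $\eta$, and uses $\log_2(1 + \eta\,\mathsf{snr})$ as the single-user mutual information for complex Gaussian inputs together with the joint-decoding formula (182). Function names are hypothetical and results are in bits.

```python
import math

def tse_hanly_efficiency(snr, beta):
    """Positive root of snr*eta^2 + (1 + beta*snr - snr)*eta - 1 = 0."""
    b = 1.0 + beta * snr - snr
    disc = math.sqrt(b * b + 4.0 * snr)
    # Pick the algebraically equivalent form that avoids cancellation.
    if b >= 0.0:
        return 2.0 / (b + disc)
    return (disc - b) / (2.0 * snr)

def spectral_efficiencies(snr, beta):
    """Return (C_sep, C_joint) in bits for complex Gaussian inputs, equal SNRs."""
    eta = tse_hanly_efficiency(snr, beta)
    c_sep = beta * math.log2(1.0 + eta * snr)
    # Joint decoding gain from (182): (eta - 1) log e - log eta.
    c_joint = c_sep + (eta - 1.0) * math.log2(math.e) - math.log2(eta)
    return c_sep, c_joint

if __name__ == "__main__":
    for snr_db in (-10, 0, 10, 20, 30):
        snr = 10.0 ** (snr_db / 10.0)
        eta = tse_hanly_efficiency(snr, 1.0)
        print(snr_db, eta, spectral_efficiencies(snr, 1.0))
```

With $\beta = 1$ the root simplifies to $2/(1 + \sqrt{1 + 4\,\mathsf{snr}})$, which makes the monotone decay from 1 toward 0 explicit.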

Multiuser efficiency with QPSK inputs and nonlinear
MMSE (individually optimal) detection is plotted in Figure
9(c). Note that this function is not monotonic: it converges
to 1 for both vanishing SNR and infinite SNR. While for
vanishing SNR this follows directly from the definition of
multiuser efficiency, the convergence to unity as the SNR goes
to infinity was shown in [66] for the case of binary inputs. A
single dip is observed for the case of identical SNRs, while two
dips are observed in the case of two SNRs of equal population
with 10 dB difference in SNR (the gap is about 10 dB).
The corresponding spectral efficiencies are plotted in Figure
9(d). The spectral efficiencies saturate to 1 bit/s/dimension at
high SNR. The difference between joint decoding and separate
decoding is quite small for both very low and very high SNRs,
while it can be 30% at around 6 dB.
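The non-monotonic shape can be explored numerically. The sketch below rests on two assumptions that should be read as such: that for equal SNRs and matched postulated and actual measures the fixed-point equation takes the form $1/\eta = 1 + \beta\,\mathsf{snr}\cdot\mathrm{mmse}(\eta\,\mathsf{snr})$, and that for QPSK with complex spreading the total mean-square error equals the MMSE of a unit-variance binary input at SNR $\eta\,\mathsf{snr}$ (each quadrature subchannel behaving like the normalized real-valued channel). Function names are hypothetical, and the bisection returns only one root when several coexist.

```python
import math

def mmse_binary(gamma, steps=1200):
    """MMSE of an equiprobable +/-1 input in unit-variance real Gaussian noise:
    1 - E[tanh(gamma - sqrt(gamma) * Z)], Z ~ N(0, 1), by the trapezoidal rule."""
    lim = 10.0
    dz = 2.0 * lim / steps
    root = math.sqrt(gamma)
    acc = 0.0
    for i in range(steps + 1):
        z = -lim + i * dz
        weight = 0.5 if i in (0, steps) else 1.0
        acc += weight * math.exp(-0.5 * z * z) * math.tanh(gamma - root * z)
    return 1.0 - acc * dz / math.sqrt(2.0 * math.pi)

def qpsk_efficiency(snr, beta, iters=60):
    """Bisection on g(eta) = 1/eta - 1 - beta*snr*mmse(eta*snr) over (0, 1]."""
    def g(eta):
        return 1.0 / eta - 1.0 - beta * snr * mmse_binary(eta * snr)
    lo, hi = 1e-9, 1.0  # g(lo) > 0 and g(hi) <= 0 up to rounding
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for snr_db in (-10, 0, 5, 10, 20):
        print(snr_db, qpsk_efficiency(10.0 ** (snr_db / 10.0), beta=1.0))
```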

Multiuser efficiency under 8PSK inputs and nonlinear
MMSE detection is plotted in Figure 9(e). The multiuser
efficiency curve is slightly better than that for QPSK inputs.
The corresponding spectral efficiencies are plotted in Figure
9(f). The spectral efficiencies saturate to 3 bit/s/dimension at
high SNR.

In Figure 10, we redo the previous experiments with
a different system load, $\beta = 3$. The results are to be compared
with those in Figure 9.

For Gaussian inputs, the multiuser efficiency curves in
Figure 10(a) take a similar shape as in Figure 9(a) but are
significantly lower due to higher load. The corresponding
spectral efficiencies are shown in Figure 10(b). It is clear
that higher load results in higher spectrum usage under joint
decoding. Separate decoding, however, is interference limited
and the spectral efficiency saturates under high SNR (cf. [10,
Figure 1]).

Figure 10(c) plots multiuser efficiency under QPSK inputs.
All solutions to the fixed-point equation of the multiuser
efficiency are shown. Under equal SNR, multiple solutions
coexist for an average SNR of 10 dB or higher. With two groups
of users with a 10 dB difference in SNR, multiple solutions are
seen between 11 and 13 dB. The solution that minimizes the
free energy is valid and is shown in solid lines, while invalid
solutions are plotted using dotted lines. An almost 0 to 1 jump
is observed under equal SNR and a much smaller jump is seen
under unbalanced SNRs. This is known as phase transition.
The asymptotics under equal SNR can be shown by taking
the limit $\mathsf{snr} \to \infty$ in (63). Essentially, if $\eta\,\mathsf{snr} \to \infty$, then
$\eta \to 1$; while if $\eta\,\mathsf{snr} \to \tau$, where $\tau$ is the solution to

$$\tau \int \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}} \left[1 - \tanh\left(\tau - z\sqrt{\tau}\right)\right] \mathrm{d}z = \frac{1}{\beta}, \tag{184}$$

then $\eta \to 0$. If $\beta > 2.085$, there exists a solution to (184), so
that two solutions coexist for large SNR.
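The load threshold can be estimated by scanning the left-hand side of (184) over $\tau$: a solution exists exactly when $1/\beta$ does not exceed its maximum. The sketch below uses a trapezoidal rule and a uniform grid, both illustrative choices, with hypothetical names.

```python
import math

def lhs_184(tau, steps=800):
    """Left-hand side of (184): tau * E[1 - tanh(tau - Z*sqrt(tau))], Z ~ N(0, 1)."""
    lim = 10.0
    dz = 2.0 * lim / steps
    root = math.sqrt(tau)
    acc = 0.0
    for i in range(steps + 1):
        z = -lim + i * dz
        weight = 0.5 if i in (0, steps) else 1.0
        acc += weight * math.exp(-0.5 * z * z) * (1.0 - math.tanh(tau - z * root))
    return tau * acc * dz / math.sqrt(2.0 * math.pi)

taus = [0.02 * i for i in range(1, 301)]  # grid on (0, 6]
values = [lhs_184(t) for t in taus]
critical_load = 1.0 / max(values)  # compare with the 2.085 quoted in the text

def count_solutions(beta):
    """Number of grid crossings of lhs_184(tau) = 1/beta."""
    target = 1.0 / beta
    return sum(1 for a, b in zip(values, values[1:])
               if (a - target) * (b - target) < 0.0)

print(critical_load, count_solutions(3.0), count_solutions(1.5))
```

Below the critical load the curve never reaches $1/\beta$, so no finite limit $\tau$ exists and only the $\eta \to 1$ branch survives at large SNR.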

Fig. 9. Multiuser efficiency and spectral efficiency as functions of SNR. The load is $\beta = 1$. (a) Multiuser efficiency, complex Gaussian inputs. (b) Spectral
efficiency, complex Gaussian inputs. (c) Multiuser efficiency, QPSK inputs. (d) Spectral efficiency, QPSK inputs. (e) Multiuser efficiency, 8PSK inputs. (f)
Spectral efficiency, 8PSK inputs.

Fig. 10. Multiuser efficiency and spectral efficiency as functions of SNR. The load is $\beta = 3$. (a) Multiuser efficiency, complex Gaussian inputs. (b) Spectral
efficiency, complex Gaussian inputs. (c) Multiuser efficiency, QPSK inputs. (d) Spectral efficiency, QPSK inputs. (e) Multiuser efficiency, 8PSK inputs. (f)
Spectral efficiency, 8PSK inputs.

The spectral efficiency under QPSK inputs and $\beta = 3$ is
shown in Figure 10(d). As a result of phase transition, one
observes a jump to saturation in the spectral efficiency under
equal-power inputs. The gain due to joint decoding can be
significant at moderate SNRs. In the case of two groups of users
with a 10 dB difference in SNR, the spectral efficiency curve
also shows one jump, and the loss due to separate decoding is
reduced significantly for a small window of SNRs around the
area of phase transition (11-13 dB). Therefore, perfect power
control may not be the best strategy in such cases.

Under 8PSK inputs, the multiuser efficiency and spectral
efficiency curves in Figures 10(e) and 10(f) take a similar shape as
the curves under QPSK inputs. Phase transition causes jumps
in both the multiuser efficiency and the spectral efficiency.

In Figures 10(d) and 10(f), a sharp bend upward is observed
at the point of phase transition. This is known as “spinodal”
in statistical physics.

A comparison of Figures 10(b), 10(d), and 10(f) shows
that under separate decoding, the spectral efficiency under
Gaussian inputs saturates well below that of QPSK and 8PSK
inputs.

VII. Conclusion 

The main contribution of this paper is a simple characterization
of the performance of CDMA multiuser detection under
arbitrary input distribution and SNR (and/or flat fading) in
the large-system limit. A broad family of multiuser detectors
is studied under the name of posterior mean estimators,
which includes well-known detectors such as the matched
filter, decorrelator, linear MMSE detector, maximum likelihood
(jointly optimal) detector, and the individually optimal
detector.

A key conclusion is the decoupling of a Gaussian multiuser
channel concatenated with a generic multiuser detector front
end. It is found that the multiuser detection output for each
user is a deterministic function of a hidden Gaussian statistic
centered at the transmitted symbol. Hence the single-user
channel seen at the multiuser detection output is equivalent
to a Gaussian channel in which the overall effect of
multiple-access interference is a degradation in the effective
signal-to-interference ratio. The degradation factor, known as the
multiuser efficiency, is the solution to a pair of coupled
fixed-point equations, and can be easily computed numerically if
not analytically.

Another set of results, tightly related to the decoupling
principle, leads to general formulas for the large-system spectral
efficiency of multiuser channels expressed in terms of the
multiuser efficiency, both under joint and separate decoding. It
is found that the decomposition of optimum spectral efficiency
as a sum of single-user efficiencies and a joint decoding gain
applies under more general conditions than shown in [11],
thereby validating Müller’s conjecture [25]. A relationship
between the spectral efficiencies under joint and separate
decoding is one of the applications of a recent basic formula
that links the mutual information and the MMSE [41].

From a practical viewpoint, this paper presents new results
on the efficiency of CDMA communication under arbitrary
input signaling such as m-PSK and m-QAM with an arbitrary
power profile. More importantly, the results in this paper allow
the performance of multiuser detection to be characterized by
a single parameter, the multiuser efficiency. The efficiency
of spectrum usage is also easily quantified by means of
this parameter. Thus, the results offer convenient performance
measures and valuable insights in the design and analysis of
CDMA systems, e.g., in power control [67].

The linear system in our study also models multiple-input
multiple-output channels under various circumstances. The
results can thus be used to evaluate the output SNR or spectral
efficiency of high-dimensional MIMO channels (such as
multiple-antenna systems) with arbitrary signaling and various
detection techniques. Some of the results in this paper have
been generalized to MIMO channels with spatial correlation
at both transmitter and receiver sides [68].